diff --git a/python-recipes/semantic-cache/04_langcache_semantic_caching.ipynb b/python-recipes/semantic-cache/04_langcache_semantic_caching.ipynb new file mode 100644 index 00000000..98a84f5b --- /dev/null +++ b/python-recipes/semantic-cache/04_langcache_semantic_caching.ipynb @@ -0,0 +1,1459 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)\n", + "\n", + "# LangCache: Semantic Caching with Redis Cloud\n", + "\n", + "This notebook demonstrates end-to-end semantic caching using **LangCache** - a managed Redis Cloud service accessed through the RedisVL library. LangCache provides enterprise-grade semantic caching with zero infrastructure management, making it ideal for production LLM applications.\n", + "\n", + "\"Open\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Introduction\n", + "\n", + "**LangCache** is a fully managed semantic cache service built on Redis Cloud. It was integrated into RedisVL in version 0.11.0 as an `LLMCache` interface implementation, making it easy for RedisVL users to:\n", + "\n", + "- Transition to a fully managed caching service\n", + "- Reduce LLM API costs by caching similar queries\n", + "- Improve application response times\n", + "- Access enterprise features without managing infrastructure\n", + "\n", + "### What You'll Learn\n", + "\n", + "In this tutorial, you will:\n", + "1. Set up LangCache with Redis Cloud\n", + "2. Load and process a knowledge base (PDF documents)\n", + "3. Generate FAQs using the Doc-to-Cache technique\n", + "4. Pre-populate a semantic cache with tagged FAQs\n", + "5. Test different cache matching strategies and thresholds\n", + "6. Integrate the cache into a RAG pipeline\n", + "7. Measure performance improvements\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. 
Environment Setup\n", + "\n", + "First, we'll install the required packages and set up our environment.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Install Required Packages\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.3\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", + "Note: you may need to restart the kernel to use updated packages.\n", + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.3\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -q \"redisvl>=0.11.0\" \"langcache\" \"sentence-transformers\"\n", + "%pip install -q \"pypdf\" \"openai>=1.0.0\" \"langchain>=0.3.0\" \"langchain-community\" \"langchain-openai\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Import Dependencies\n" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import time\n", + "import json\n", + "from typing import List, Dict, Any\n", + "\n", + "# RedisVL imports\n", + "from redisvl.extensions.cache.llm import LangCacheSemanticCache" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. LangCache setup\n", + "\n", + "### Sign up for LangCache\n", + "\n", + "If you haven't already, sign up for a free Redis Cloud account:\n", + "\n", + "**[Log in or sign up for Redis Cloud →](https://cloud.redis.io/#/)**\n", + "\n", + "After signing up:\n", + "1. Create a new database\n", + "2. Create a new LangCache service (Select 'LangCache' on the left menu bar)\n", + "3. Copy your **API Key**\n", + "4. Copy your **Cache ID**\n", + "5. Copy your **URL**\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Configure Environment Variables\n", + "You'll need the LangCache API Key, Cache ID, URL\n", + "You will also need access to an LLM. 
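One way to provide these values without hardcoding them in the notebook is to set the environment variables up front. A minimal sketch (the variable names match the cells below):\n",
+    "\n",
+    "```python\n",
+    "import os\n",
+    "from getpass import getpass\n",
+    "\n",
+    "# Prompt for secrets instead of pasting them into code cells\n",
+    "os.environ[\"LANGCACHE_API_KEY\"] = getpass(\"LangCache API key: \")\n",
+    "os.environ[\"LANGCACHE_ID\"] = input(\"LangCache cache ID: \")\n",
+    "os.environ[\"OPENAI_API_KEY\"] = getpass(\"OpenAI API key: \")\n",
+    "```\n",
+    "\n",
+    "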
In this notebook we'll be using OpenAI" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Initialize Semantic Cache with LangCache-Embed Model\n", + "\n", + "We'll create a cache instance using the `redis/langcache-embed-v1` model, which is specifically optimized for semantic caching tasks.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "wy4ECQMIVUCcYGbZr_Lg007Cifh4GkgiIRNAf3S4ITMWQ4puuq-OStyjMvH-iD1m0oIB6hg5EVYQye5r1xajEFL7e0AUw5Gn_UEksTQdSm-Hwzu3wXsJJ4emhp8OopEJfHx6JnPlW36LDkCf6ne4Kj8CWiQkphQHqaEeKV9mdgbml-8qOv19AFr0y5vmTtkU_Xt5ByfGMTO-mI9wMKXNLOfwZixM1kiE8KAL_JM7dJN_EHQh\n", + "50eb6a09acf5415d8b68619b1ccffd9a\n", + "https://aws-us-east-1.langcache.redis.io\n" + ] + } + ], + "source": [ + "langcache_api_key = os.environ.get('LANGCACHE_API_KEY') # found on your cloud console\n", + "langcache_id = os.environ.get('LANGCACHE_ID') # found on your cloud console\n", + "server_url = \"https://aws-us-east-1.langcache.redis.io\" # found on your cloud console\n", + "\n", + "\n", + "print(langcache_api_key)\n", + "print(langcache_id)\n", + "print(server_url)\n", + "\n", + "# Create Semantic Cache instance\n", + "cache = LangCacheSemanticCache(\n", + " server_url=server_url,\n", + " cache_id=langcache_id,\n", + " api_key=langcache_api_key,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "10:35:45 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "[]\n", + "10:35:45 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:35:45 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "[{'entry_id': '5eb63bbbe01eeed093cb22bb8f5acdc3', 'prompt': 'hello world', 'response': 'hello world from langcache', 'vector_distance': 0.07242219999999999, 'inserted_at': 0.0, 'updated_at': 0.0}]\n" + ] + } + ], + "source": [ + "# Check your cache is working\n", + "r = cache.check('hello world')\n", + "print(r) # should be empty on first run\n", + "\n", + "cache.store('hello world', 'hello world from langcache')\n", + "result = cache.check('hi world')\n", + "\n", + "print(result)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# RAG with semantic caching\n", + "\n", + "Now that we have a working semantic cache service running and we're connected to it, let's use it in an application.\n", + "\n", + "We'll build a simple Retrieval Augmented Generation (RAG) app using a PDF of NVidia's 2023 10k filing report.\n", + "\n", + "To get the full benefit of semantic caching we'll preload our cache with Frequently Asked Questions (FAQs) generated by an LLM about our PDF." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Generate FAQs Using Doc-to-Cache Technique\n", + "\n", + "The Doc-to-Cache approach uses an LLM to generate frequently asked questions from document chunks. These FAQs are then used to pre-populate the semantic cache with high-quality, factual responses.\n", + "\n", + "We'll work with three types of data:\n", + "1. **Knowledge Base**: PDF document(s) that contain factual information\n", + "2. 
**FAQs**: Derived from the knowledge base using Doc-to-Cache technique\n", + "3. **Test Dataset**: For evaluating and optimizing cache performance\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/justin.cechmanek/.pyenv/versions/3.11.9/envs/redis-ai-res/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", + " from .autonotebook import tqdm as notebook_tqdm\n" + ] + } + ], + "source": [ + "# LangChain imports\n", + "from langchain_community.document_loaders import PyPDFLoader\n", + "from langchain_text_splitters import RecursiveCharacterTextSplitter\n", + "from langchain_openai import ChatOpenAI\n", + "from langchain_core.prompts import PromptTemplate, ChatPromptTemplate\n", + "from langchain_core.output_parsers import JsonOutputParser\n", + "\n", + "from pydantic import BaseModel, Field\n", + "import getpass" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "if \"OPENAI_API_KEY\" not in os.environ:\n", + " os.environ[\"OPENAI_API_KEY\"] = getpass(\"Enter your OpenAI API key: \")\n", + "\n", + "# Initialize OpenAI LLM for FAQ generation and RAG\n", + "llm = ChatOpenAI(\n", + " model=\"gpt-4o-mini\",\n", + " temperature=0.3,\n", + " max_tokens=2000\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Load PDF Knowledge Base\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "# Download sample PDF if not already present\n", + "!mkdir -p data\n", + "!wget -q -O data/nvidia-10k.pdf https://raw.githubusercontent.com/redis-developer/redis-ai-resources/main/python-recipes/RAG/resources/nvd-10k-2023.pdf" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Loaded PDF: data/nvidia-10k.pdf\n", + " Total pages: 169\n", + " Created chunks: 388\n", + "\n", + "Sample chunk preview:\n", + "Table of Contents\n", + "The world’s leading cloud service providers, or CSPs, and consumer internet companies use our GPUs and broader data center-scale\n", + "accelerated computing platforms to enable, accelerate or enrich the services they deliver to billions of end-users, including search,\n", + "recommendations, so...\n" + ] + } + ], + "source": [ + "# Load and chunk the PDF\n", + "pdf_path = \"data/nvidia-10k.pdf\"\n", + "\n", + "# Configure text splitter for optimal chunk sizes\n", + "text_splitter = RecursiveCharacterTextSplitter(\n", + " chunk_size=2000,\n", + " chunk_overlap=200,\n", + " separators=[\"\\n\\n\", \"\\n\", \". 
\", \" \", \"\"]\n", + ")\n", + "\n", + "# Load and split the document\n", + "loader = PyPDFLoader(pdf_path)\n", + "documents = loader.load()\n", + "chunks = text_splitter.split_documents(documents)\n", + "\n", + "print(f\"Loaded PDF: {pdf_path}\")\n", + "print(f\" Total pages: {len(documents)}\")\n", + "print(f\" Created chunks: {len(chunks)}\")\n", + "print(f\"\\nSample chunk preview:\")\n", + "print(f\"{chunks[10].page_content[:300]}...\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "# Define the FAQ data model\n", + "class QuestionAnswer(BaseModel):\n", + " question: str = Field(description=\"A frequently asked question derived from the document content\")\n", + " answer: str = Field(description=\"A factual answer to the question based on the document\")\n", + " category: str = Field(description=\"Category of the question (e.g., 'financial', 'products', 'operations')\")\n", + "\n", + "class FAQList(BaseModel):\n", + " faqs: List[QuestionAnswer] = Field(description=\"List of question-answer pairs extracted from the document\")\n", + "\n", + "# Set up JSON output parser\n", + "json_parser = JsonOutputParser(pydantic_object=FAQList)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "FAQ generation chain configured\n" + ] + } + ], + "source": [ + "# Create the FAQ generation prompt\n", + "faq_prompt = PromptTemplate(\n", + " template=\"\"\"You are a document analysis expert. Extract 3-5 high-quality FAQs from the following document chunk.\n", + "\n", + "Guidelines:\n", + "- Generate diverse, specific questions that users would realistically ask\n", + "- Provide accurate, complete answers based ONLY on the document content\n", + "- Assign each FAQ to a category: 'financial', 'products', 'operations', 'technology', or 'general'\n", + "- Avoid vague or overly generic questions\n", + "- If the chunk lacks substantial content, return fewer FAQs\n", + "\n", + "{format_instructions}\n", + "\n", + "Document Chunk:\n", + "{doc_content}\n", + "\n", + "FAQs JSON:\"\"\",\n", + " input_variables=[\"doc_content\"],\n", + " partial_variables={\"format_instructions\": json_parser.get_format_instructions()}\n", + ")\n", + "\n", + "# Create the FAQ generation chain\n", + "faq_chain = faq_prompt | llm | json_parser\n", + "\n", + "print(\"FAQ generation chain configured\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Testing FAQ generation on sample chunk...\n", + "\n", + "10:36:14 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Generated 5 FAQs:\n", + "\n", + "1. Q: What industries are leveraging NVIDIA's GPUs for automation?\n", + " Category: operations\n", + " A: A rapidly growing number of enterprises and startups across a broad range of industries, including transportation for autonomous driving, healthcare f...\n", + "\n", + "2. Q: What was the reason for the termination of the Arm Share Purchase Agreement?\n", + " Category: general\n", + " A: The Share Purchase Agreement between NVIDIA and SoftBank Group Corp. was terminated due to significant regulatory challenges that prevented the comple...\n", + "\n", + "3. 
Q: What are some applications of NVIDIA's GPUs in professional design?\n", + " Category: products\n", + " A: Professional designers use NVIDIA's GPUs and software to create visual effects in movies and to design buildings and products ranging from cell phones...\n" + ] + } + ], + "source": [ + "# Test FAQ generation on a single chunk\n", + "print(\"Testing FAQ generation on sample chunk...\\n\")\n", + "test_faqs = faq_chain.invoke({\"doc_content\": chunks[10].page_content})\n", + "\n", + "print(f\"Generated {len(test_faqs.get('faqs', []))} FAQs:\")\n", + "for i, faq in enumerate(test_faqs.get('faqs', [])[:3], 1):\n", + " print(f\"\\n{i}. Q: {faq['question']}\")\n", + " print(f\" Category: {faq['category']}\")\n", + " print(f\" A: {faq['answer'][:150]}...\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Generating FAQs from document chunks...\n", + "\n", + "Processing chunk 1/25...\n", + "10:36:23 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:36:34 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:36:43 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:36:52 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:36:58 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Processing chunk 6/25...\n", + "10:37:10 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:14 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:21 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:29 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:38 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Processing chunk 11/25...\n", + "10:37:49 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:37:57 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:06 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:14 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:23 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Processing chunk 16/25...\n", + "10:38:32 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:39 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:38:57 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:05 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:15 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "Processing chunk 21/25...\n", + "10:39:23 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:31 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + 
"10:39:40 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:47 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "10:39:54 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "\n", + "Generated 113 FAQs total\n", + "\n", + "Category distribution:\n", + " technology: 29\n", + " products: 27\n", + " financial: 20\n", + " operations: 20\n", + " general: 17\n" + ] + } + ], + "source": [ + "# Generate FAQs from all chunks (limited to first 25 for demo purposes)\n", + "def extract_faqs_from_chunks(chunks: List[Any], max_chunks: int = 25) -> List[Dict]:\n", + " \"\"\"Extract FAQs from document chunks using LLM.\n", + " \n", + " chunks: list of document chunks\n", + " max_chunks: maximum number of chunks to process\n", + " \n", + " Returns: A list of question-answer pairs\n", + " \"\"\"\n", + " all_faqs = []\n", + "\n", + " for i, chunk in enumerate(chunks[:max_chunks]):\n", + " if i % 5 == 0:\n", + " print(f\"Processing chunk {i+1}/{min(len(chunks), max_chunks)}...\", flush=True)\n", + "\n", + " try:\n", + " result = faq_chain.invoke({\"doc_content\": chunk.page_content})\n", + " if result and result.get(\"faqs\"):\n", + " all_faqs.extend(result[\"faqs\"])\n", + " except Exception as e:\n", + " print(f\" Warning: Skipped chunk {i+1} due to error: {str(e)[:100]}\")\n", + " continue\n", + "\n", + " return all_faqs\n", + "\n", + "# Extract FAQs\n", + "print(\"\\nGenerating FAQs from document chunks...\\n\")\n", + "faqs = extract_faqs_from_chunks(chunks, max_chunks=25)\n", + "\n", + "print(f\"\\nGenerated {len(faqs)} FAQs total\")\n", + "print(f\"\\nCategory distribution:\")\n", + "categories = {}\n", + "for faq in faqs:\n", + " cat = faq.get('category', 'unknown')\n", + " categories[cat] = categories.get(cat, 0) + 1\n", + "for cat, count in sorted(categories.items(), key=lambda x: x[1], reverse=True):\n", + " print(f\" {cat}: {count}\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Pre-load semantic cache with FAQs\n", + "\n", + "Now we'll populate the cache instance with our generated FAQs. 
We'll use the `store()` API with metadata tags for filtering and organization.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Storing FAQs in cache...\n", + "\n", + " Stored 0/113 FAQs...\n", + "10:39:54 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:54 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:54 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:54 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:55 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:56 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 
Created\"\n", + " Stored 20/113 FAQs...\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:57 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:58 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + " Stored 40/113 FAQs...\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST 
https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:39:59 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:00 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:01 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + " Stored 60/113 FAQs...\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries 
\"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:02 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:03 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + " Stored 80/113 FAQs...\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:04 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST 
https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + " Stored 100/113 FAQs...\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries 
\"HTTP/1.1 201 Created\"\n", + "10:40:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "10:40:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries \"HTTP/1.1 201 Created\"\n", + "\n", + "Stored 113 FAQs in cache\n", + "\n", + "Example cache entries:\n", + "\n", + "1. Key: eb461a36940c04a1d307d33a595188af\n", + " Q: What is the fiscal year end date for NVIDIA Corporation as reported in the Form 10-K?...\n", + "\n", + "2. Key: 0daa2589e67ab291d447f3e103435706\n", + " Q: What is the trading symbol for NVIDIA Corporation's common stock?...\n" + ] + } + ], + "source": [ + "# Store FAQs in cache with metadata tags\n", + "print(\"Storing FAQs in cache...\\n\")\n", + "\n", + "stored_count = 0\n", + "cache_keys = {} # Map questions to their cache keys\n", + "\n", + "for i, faq in enumerate(faqs):\n", + " if i % 20 == 0:\n", + " print(f\" Stored {i}/{len(faqs)} FAQs...\", flush=True)\n", + "\n", + " try:\n", + " # Store with metadata - note that metadata is stored but not used for filtering in basic SemanticCache\n", + " key = cache.store(prompt=faq['question'], response=faq['answer'], metadata={'category': faq['category']})\n", + " cache_keys[faq['question']] = key\n", + " stored_count += 1\n", + " except Exception as e:\n", + " print(f\" Warning: Failed to store FAQ {i+1}: {str(e)[:100]}\")\n", + "\n", + "print(f\"\\nStored {stored_count} FAQs in cache\")\n", + "\n", + "print(f\"\\nExample cache entries:\")\n", + "for i, (q, k) in enumerate(list(cache_keys.items())[:2], 1):\n", + " print(f\"\\n{i}. Key: {k}\")\n", + " print(f\" Q: {q[:150]}...\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. Evaluating our semantic cache\n", + "Now that we have a semantic cache populated with question answer pairs we can evaluate its effectiveness.\n", + "\n", + "Unlike standard caching that uses exact key:value look ups, semantic caches relies on the notion of semantic embedding similarity.\n", + "The benefits of semantic matching are that similar questions such as, \"who is the king of England?\", and, \"who is the monarch of Britain?\" can be matched together.\n", + "This flexibility comes at the cost of occasional mismatches. 
A question like, \"who is the queen of England?\" is also similar and likely to match.\n", + "\n", + "Let's create a dataset of test questions to see which match and which don't to evaluate our cache hit rate, and our accuracy.\n", + "### Create test/evaluation dataset\n", + "\n", + "We'll create a test dataset with:\n", + "- **Positive examples**: Questions that should match cached FAQs\n", + "- **Negative examples**: Questions that should NOT match cached FAQs\n", + "- **Edge cases**: Slightly different phrasings to test threshold sensitivity\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test dataset created\n", + " Positive examples: 5\n", + " Negative examples: 5\n", + " Edge cases: 5\n" + ] + } + ], + "source": [ + "# Create test dataset with positive examples (should match NVIDIA FAQs). We'll take the first 5 from our generated FAQs and modify them slightly.\n", + "positive_examples = [\n", + "{'query': \"What's the fiscal year end for NVIDIA Corporation?\",\n", + " 'expected_answer': 'The fiscal year ended January 29, 2023.',\n", + " 'category': 'general',\n", + " 'expected_match': True} ,\n", + "{'query': \"What is the trading symbol of NVIDIA Corporation's common stock in the market?\",\n", + " 'expected_answer': \"The trading symbol for NVIDIA Corporation's common stock is NVDA.\",\n", + " 'category': 'financial' ,\n", + " 'expected_match': True} ,\n", + "{'query': 'Where is the location of the executive office?',\n", + " 'expected_answer': 'The principal executive office of NVIDIA Corporation is located at 2788 San Tomas Expressway, Santa Clara, California 95051.',\n", + " 'category': 'operations' ,\n", + " 'expected_match': True} ,\n", + "{'query': 'Does the SEC consider NVIDIA Corporation a well-known seasoned issuer?',\n", + " 'expected_answer': 'No, NVIDIA Corporation is not considered a well-known seasoned issuer as indicated by the check mark in the document.',\n", + " 'category': 'financial' ,\n", + " 'expected_match': True} ,\n", + "{'query': \"In what exchange platform is NVIDIA's stock traded in?\",\n", + " 'expected_answer': \"NVIDIA Corporation's common stock is registered on The Nasdaq Global Select Market.\",\n", + " 'category': 'financial' ,\n", + " 'expected_match': True} ,\n", + "]\n", + "\n", + "# Create test dataset with negative examples\n", + "negative_examples = [\n", + " {\"query\": \"Where are these reports being submitted and who is reading them?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"What is Jensen Huang's net worth?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"What games run best on the RTX 4090? 
NVIDIA GPU?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"What time is it?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"Should I invest my life savings in this organization?\", \"expected_match\": False, \"category\": \"general\"},\n", + "]\n", + "\n", + "# Create test dataset with edge cases (slightly different phrasings)\n", + "edge_cases = [\n", + " {\"query\": \"What's the fiscal year end for Microsoft Corporation?\", \"expected_match\": False, \"category\": \"general\"},\n", + " {\"query\": \"What is the company total revenue for the last 5 years?\", \"expected_match\": False, \"category\": \"financial\"},\n", + " {\"query\": \"What's the location of the manufacturing plant for NVIDIA?\", \"expected_match\": False, \"category\": \"general\"},\n", + " {\"query\": \"Where are the locations of each office of NVIDIA Corporation?\", \"expected_match\": False, \"category\": \"off-topic\"},\n", + " {\"query\": \"What is the trading symbold of NVIDIA Corporation on the Japan exchange?\", \"expected_match\": False, \"category\": \"general\"},\n", + "]\n", + "\n", + "print(f\"Test dataset created\")\n", + "print(f\" Positive examples: {len(positive_examples)}\")\n", + "print(f\" Negative examples: {len(negative_examples)}\")\n", + "print(f\" Edge cases: {len(edge_cases)}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Test semantic similarity\n", + "Let's test how the cache performs with different types of queries and matching thresholds.\n", + "\n", + "We'll run through our 15 sample questions and track which ones get a hit and which ones don't. We'll also track if they should have hit.\n", + "\n", + "This will give us a baseline of our cache performance." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Testing semantic similarity:\n", + "\n", + "10:41:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "0. Cache HIT (distance: 0.0842)\n", + " Original query: What's the fiscal year end for NVIDIA Corporation?\n", + " Matched: What is the fiscal year end date for NVIDIA Corporation as reported in the Form ...\n", + "10:41:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "1. Cache HIT (distance: 0.0174)\n", + " Original query: What is the trading symbol of NVIDIA Corporation's common stock in the market?\n", + " Matched: What is the trading symbol for NVIDIA Corporation's common stock?...\n", + "10:41:05 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "2. Cache MISS\n", + " Original query: Where is the location of the executive office?\n", + " Expected match: True\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "3. 
Cache HIT (distance: 0.0352)\n", + " Original query: Does the SEC consider NVIDIA Corporation a well-known seasoned issuer?\n", + " Matched: Is NVIDIA Corporation considered a well-known seasoned issuer?...\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "4. Cache HIT (distance: 0.1314)\n", + " Original query: In what exchange platform is NVIDIA's stock traded in?\n", + " Matched: What is the trading symbol for NVIDIA Corporation's common stock?...\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "5. Cache MISS\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "6. Cache MISS\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "7. Cache MISS\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "8. Cache MISS\n", + "10:41:06 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "9. Cache MISS\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10. Cache MISS\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "11. Cache HIT (distance: 0.1066)\n", + " Original query: What's the location of the manufacturing plant for NVIDIA?\n", + " MISS MATCHED: Where is NVIDIA headquartered?...\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "12. Cache HIT (distance: 0.0973)\n", + " Original query: What date does NVIDIA use as it's year end for acounting purposes?\n", + " Matched: When do NVIDIA's consumer products typically see stronger revenue?...\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "13. Cache HIT (distance: 0.0480)\n", + " Original query: Where are the locations of each office of NVIDIA Corporation?\n", + " MISS MATCHED: Where is the principal executive office of NVIDIA Corporation located?...\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "14. 
Cache HIT (distance: 0.0870)\n", + " Original query: What is the trading symbold of NVIDIA Corporation on the NASDAQ exchange?\n", + " Matched: What is the trading symbol for NVIDIA Corporation's common stock?...\n", + "\n", + "Summary Metrics:\n", + " Accuracy: 80.000%\n", + " Precision: 75.000%\n", + " Recall: 85.714%\n", + " F1 Score: 80.000%\n" + ] + } + ], + "source": [ + "# Test with semantically similar queries\n", + "print(\"Testing semantic similarity:\\n\")\n", + "\n", + "full_test_data = positive_examples + negative_examples + edge_cases\n", + "\n", + "# Track our metrics\n", + "true_positives = 0 # we have a hit and it hould match\n", + "false_positives = 0 # we have a hit and it SHOULD NOT match\n", + "false_negatives = 0 # we have a miss and it SHOULD match\n", + "true_negatives = 0 # we have a miss and it SHOULD NOT match\n", + "\n", + "for i, question in enumerate(full_test_data):\n", + " result = cache.check(prompt=question['query'], return_fields=[\"prompt\", \"response\", \"distance\"])\n", + "\n", + " if result and question['expected_match']:\n", + " true_positives += 1\n", + " print(f\"{i}. Cache HIT (distance: {result[0].get('vector_distance', 'N/A'):.4f})\")\n", + " print(f\" Original query: {question['query']}\")\n", + " print(f\" Matched: {result[0]['prompt'][:80]}...\")\n", + " elif result and not question['expected_match']:\n", + " false_positives += 1\n", + " print(f\"{i}. Cache HIT (distance: {result[0].get('vector_distance', 'N/A'):.4f})\")\n", + " print(f\" Original query: {question['query']}\")\n", + " print(f\" MISS MATCHED: {result[0]['prompt'][:80]}...\")\n", + " elif not result and question['expected_match']:\n", + " false_negatives += 1\n", + " print(f\"{i}. Cache MISS\")\n", + " print(f\" Original query: {question['query']}\")\n", + " print(f\" Expected match: {question['expected_match']}\")\n", + " elif not result and not question['expected_match']:\n", + " true_negatives += 1\n", + " print(f\"{i}. Cache MISS\")\n", + "\n", + "# Calculate our summary metrics\n", + "accuracy = (true_positives + true_negatives) / len(full_test_data)\n", + "precision = true_positives / (true_positives + false_positives)\n", + "recall = true_positives / (true_positives + false_negatives)\n", + "f1_score = 2 * (precision * recall) / (precision + recall)\n", + "\n", + "print(f\"\\nSummary Metrics:\")\n", + "print(f\" Accuracy: {100*accuracy:.3f}%\")\n", + "print(f\" Precision: {100*precision:.3f}%\")\n", + "print(f\" Recall: {100*recall:.3f}%\")\n", + "print(f\" F1 Score: {100*f1_score:.3f}%\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 6. Tune cache threshold\n", + "\n", + "Using sample questions, we can find the optimal distance threshold based on our test dataset." 
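,
+    "\n",
+    "The sweep below calls `cache.check()` once per test query and then compares the returned `vector_distance` against each candidate threshold locally, so the cache is not re-queried for every threshold value. Once a threshold is chosen, it can be passed per request, as the RAG section later does. A minimal sketch, with a made-up query and the threshold selected by the sweep below:\n",
+    "\n",
+    "```python\n",
+    "hits = cache.check(\n",
+    "    prompt=\"What is NVIDIA's ticker symbol?\",  # hypothetical user query\n",
+    "    distance_threshold=0.2,  # value selected by the sweep below\n",
+    ")\n",
+    "if hits:\n",
+    "    print(hits[0][\"response\"])\n",
+    "```"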
+ ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Testing cache performance across different similarity thresholds...\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:07 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:08 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:09 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:09 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:09 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "10:41:09 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "THRESHOLD OPTIMIZATION RESULTS\n", + "====================================================================================================\n", + "\n", + "Performance Metrics by Threshold:\n", + " Threshold Total Hits Total Misses True Positives False Positives True Negatives False Negatives Precision Recall F1 Score Accuracy\n", + " 0.20 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.30 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.40 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.50 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.60 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.70 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.80 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.85 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.90 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 0.95 8 7 6 2 6 1 0.75 0.857143 0.8 0.8\n", + " 1.00 8 7 6 2 6 1 0.75 0.857143 0.8 
0.8\n", + "\n", + "====================================================================================================\n", + "OPTIMAL THRESHOLD: 0.2\n", + " F1 Score: 0.800\n", + " Precision: 0.750\n", + " Recall: 0.857\n", + " Accuracy: 0.800\n", + "====================================================================================================\n", + "\n", + "Detailed breakdown at optimal threshold (0.2):\n", + "\n" + ] + } + ], + "source": [ + "# Test a range of different cache similarity thresholds\n", + "import pandas as pd\n", + "\n", + "# Define threshold ranges to test\n", + "thresholds_to_test = [0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.85, 0.90, 0.95, 1.00]\n", + "\n", + "print(\"Testing cache performance across different similarity thresholds...\")\n", + "\n", + "# Store results for all queries to reuse across thresholds\n", + "query_results = []\n", + "for test_case in full_test_data:\n", + " result = cache.check(prompt=test_case['query'], return_fields=[\"prompt\", \"response\", \"vector_distance\", \"entry_id\"])\n", + "\n", + " query_results.append({\n", + " 'query': test_case['query'],\n", + " 'expected_match': test_case['expected_match'],\n", + " 'cache_result': result[0] if result else None,\n", + " 'distance': result[0].get('vector_distance') if result else float('inf')\n", + " })\n", + "\n", + "# Evaluate each threshold\n", + "results = []\n", + "\n", + "for threshold in thresholds_to_test:\n", + " true_positives = 0\n", + " false_positives = 0\n", + " true_negatives = 0\n", + " false_negatives = 0\n", + "\n", + " for query_data in query_results:\n", + " # Determine if this would be a cache hit at this threshold\n", + " is_cache_hit = query_data['distance'] < threshold\n", + " should_match = query_data['expected_match']\n", + "\n", + " if is_cache_hit and should_match:\n", + " true_positives += 1\n", + " elif is_cache_hit and not should_match:\n", + " false_positives += 1\n", + " elif not is_cache_hit and not should_match:\n", + " true_negatives += 1\n", + " elif not is_cache_hit and should_match:\n", + " false_negatives += 1\n", + "\n", + " # Calculate metrics\n", + " total_hits = true_positives + false_positives\n", + " total_misses = true_negatives + false_negatives\n", + "\n", + " precision = true_positives / (true_positives + false_positives) if (true_positives + false_positives) > 0 else 0\n", + " recall = true_positives / (true_positives + false_negatives) if (true_positives + false_negatives) > 0 else 0\n", + " f1_score = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0\n", + " accuracy = (true_positives + true_negatives) / len(full_test_data)\n", + "\n", + " results.append({\n", + " 'Threshold': threshold,\n", + " 'Total Hits': total_hits,\n", + " 'Total Misses': total_misses,\n", + " 'True Positives': true_positives,\n", + " 'False Positives': false_positives,\n", + " 'True Negatives': true_negatives,\n", + " 'False Negatives': false_negatives,\n", + " 'Precision': precision,\n", + " 'Recall': recall,\n", + " 'F1 Score': f1_score,\n", + " 'Accuracy': accuracy\n", + " })\n", + "\n", + "# Display results in a formatted table\n", + "df_results = pd.DataFrame(results)\n", + "\n", + "print(\"THRESHOLD OPTIMIZATION RESULTS\")\n", + "print(\"=\"*100)\n", + "print(\"\\nPerformance Metrics by Threshold:\")\n", + "print(df_results.to_string(index=False))\n", + "\n", + "# Find optimal threshold based on F1 score\n", + "optimal_idx = df_results['F1 Score'].idxmax()\n", + "optimal_threshold = df_results.loc[optimal_idx, 
'Threshold']\n",
+    "optimal_f1 = df_results.loc[optimal_idx, 'F1 Score']\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*100)\n",
+    "print(f\"OPTIMAL THRESHOLD: {optimal_threshold}\")\n",
+    "print(f\" F1 Score: {optimal_f1:.3f}\")\n",
+    "print(f\" Precision: {df_results.loc[optimal_idx, 'Precision']:.3f}\")\n",
+    "print(f\" Recall: {df_results.loc[optimal_idx, 'Recall']:.3f}\")\n",
+    "print(f\" Accuracy: {df_results.loc[optimal_idx, 'Accuracy']:.3f}\")\n",
+    "print(\"=\"*100)\n",
+    "\n",
+    "# Show detailed breakdown for optimal threshold\n",
+    "print(f\"\\nDetailed breakdown at optimal threshold ({optimal_threshold}):\\n\")\n",
+    "for query_data in query_results:\n",
+    "    is_cache_hit = query_data['distance'] < optimal_threshold\n",
+    "    should_match = query_data['expected_match']\n",
+    "\n",
+    "    status = \"\"\n",
+    "    if is_cache_hit and should_match:\n",
+    "        status = \"✓ TP (True Positive)\"\n",
+    "    elif is_cache_hit and not should_match:\n",
+    "        status = \"✗ FP (False Positive)\"\n",
+    "    elif not is_cache_hit and not should_match:\n",
+    "        status = \"✓ TN (True Negative)\"\n",
+    "    elif not is_cache_hit and should_match:\n",
+    "        status = \"✗ FN (False Negative)\"\n",
+    "\n",
+    "    # Print each query with its classification so the breakdown is actually displayed\n",
+    "    print(f\"  {status}: {query_data['query']}\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 7. RAG pipeline integration\n",
+    "\n",
+    "Now let's integrate the semantic cache into a complete RAG pipeline and measure the performance improvements."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Build a simple RAG chain\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "RAG chain created\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Create a simple RAG prompt template\n",
+    "rag_template = ChatPromptTemplate.from_messages([\n",
+    "    (\"system\", \"You are a helpful assistant answering questions about NVIDIA based on their 10-K filing. 
Provide accurate, concise answers.\"),\n", + " (\"user\", \"{question}\")\n", + "])\n", + "\n", + "# Create RAG chain\n", + "rag_chain = rag_template | llm\n", + "\n", + "print(\"RAG chain created\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create cached RAG function\n" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Cached RAG function ready\n" + ] + } + ], + "source": [ + "def rag_with_cache(question: str, use_cache: bool = True) -> tuple:\n", + " \"\"\"\n", + " Process a question through RAG pipeline with optional semantic caching.\n", + "\n", + " Returns: A tuple of (answer, cache_hit, response_time)\n", + " \"\"\"\n", + " start_time = time.time()\n", + " cache_hit = False\n", + "\n", + " # Check cache first if enabled\n", + " if use_cache:\n", + " cached_result = cache.check(prompt=question, distance_threshold=optimal_threshold)\n", + " if cached_result:\n", + " answer = cached_result[0]['response']\n", + " cache_hit = True\n", + " response_time = time.time() - start_time\n", + " return answer, cache_hit, response_time\n", + "\n", + " # Cache miss - use LLM\n", + " answer = rag_chain.invoke({\"question\": question})\n", + " response_time = time.time() - start_time\n", + "\n", + " # Store in cache for future use\n", + " if use_cache and hasattr(answer, 'content'):\n", + " cache.store(prompt=question, response=answer.content)\n", + " elif use_cache:\n", + " cache.store(prompt=question, response=str(answer))\n", + "\n", + " return answer.content if hasattr(answer, 'content') else str(answer), cache_hit, response_time\n", + "\n", + "print(\"Cached RAG function ready\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Performance comparison: with vs without cache\n" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "================================================================================\n", + "PERFORMANCE COMPARISON: With Cache vs Without Cache\n", + "================================================================================\n", + "\n", + "[FIRST PASS - Populating Cache]\n", + "\n", + "10:40:12 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "1. What is NVIDIA's primary business?\n", + " Cache: HIT | Time: 0.116s\n", + " Answer: NVIDIA has expanded into several large and important computationally intensive fields including scie...\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "2. How much revenue did NVIDIA generate?\n", + " Cache: HIT | Time: 0.120s\n", + " Answer: NVIDIA's consumer products usually see stronger revenue in the second half of their fiscal year, wit...\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "3. 
What are NVIDIA's main products?\n", + " Cache: HIT | Time: 0.125s\n", + " Answer: NVIDIA specializes in four large markets: Data Center, Gaming, Professional Visualization, and Autom...\n", + "\n", + "\n", + "[SECOND PASS - Cache Hits with Paraphrased Questions]\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "1. What does NVIDIA do as a business?\n", + " Cache: HIT ✓ | Time: 0.122s\n", + " Answer: NVIDIA's business has evolved from a primary focus on gaming products to broader markets, including ...\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "2. Can you tell me NVIDIA's revenue figures?\n", + " Cache: HIT ✓ | Time: 0.119s\n", + " Answer: NVIDIA announces material financial information to investors through its investor relations website,...\n", + "\n", + "10:40:13 httpx INFO HTTP Request: POST https://aws-us-east-1.langcache.redis.io/v1/caches/50eb6a09acf5415d8b68619b1ccffd9a/entries/search \"HTTP/1.1 200 OK\"\n", + "3. What products does NVIDIA sell?\n", + " Cache: HIT ✓ | Time: 0.125s\n", + " Answer: NVIDIA's Graphics segment includes GeForce GPUs for gaming and PCs, the GeForce NOW game streaming s...\n", + "\n", + "\n", + "[THIRD PASS - Without Cache (Baseline)]\n", + "\n", + "10:40:15 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "1. What is NVIDIA's primary business?\n", + " Cache: DISABLED | Time: 1.640s\n", + "\n", + "10:40:17 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "2. How much revenue did NVIDIA generate?\n", + " Cache: DISABLED | Time: 2.014s\n", + "\n", + "10:40:21 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "3. What are NVIDIA's main products?\n", + " Cache: DISABLED | Time: 4.001s\n", + "\n", + "\n", + "================================================================================\n", + "PERFORMANCE SUMMARY\n", + "================================================================================\n", + "Average time - First pass (cache miss): 0.120s\n", + "Average time - Second pass (cache hit): 0.122s\n", + "Average time - Without cache: 2.552s\n", + "\n", + "Speedup with cache: 1.0x faster\n", + " Cache hit rate: 0%\n" + ] + } + ], + "source": [ + "# Test questions for RAG evaluation\n", + "test_questions_rag = [\n", + " \"What is NVIDIA's primary business?\",\n", + " \"How much revenue did NVIDIA generate?\",\n", + " \"What are NVIDIA's main products?\",\n", + "]\n", + "\n", + "print(\"\\n\" + \"=\"*80)\n", + "print(\"PERFORMANCE COMPARISON: With Cache vs Without Cache\")\n", + "print(\"=\"*80)\n", + "\n", + "# First pass - populate cache (cache misses, must call LLM)\n", + "print(\"\\n[FIRST PASS - Populating Cache]\\n\")\n", + "first_pass_times = []\n", + "\n", + "for i, question in enumerate(test_questions_rag, 1):\n", + " answer, cache_hit, response_time = rag_with_cache(question, use_cache=True)\n", + " first_pass_times.append(response_time)\n", + " print(f\"{i}. 
{question}\")\n",
+    "    print(f\" Cache: {'HIT' if cache_hit else 'MISS'} | Time: {response_time:.3f}s\")\n",
+    "    print(f\" Answer: {answer[:100]}...\\n\")\n",
+    "\n",
+    "# Second pass - test cache hits with similar questions\n",
+    "print(\"\\n[SECOND PASS - Cache Hits with Paraphrased Questions]\\n\")\n",
+    "second_pass_times = []\n",
+    "second_pass_hits = []\n",
+    "\n",
+    "similar_questions = [\n",
+    "    \"What does NVIDIA do as a business?\",\n",
+    "    \"Can you tell me NVIDIA's revenue figures?\",\n",
+    "    \"What products does NVIDIA sell?\",\n",
+    "]\n",
+    "\n",
+    "for i, question in enumerate(similar_questions, 1):\n",
+    "    answer, cache_hit, response_time = rag_with_cache(question, use_cache=True)\n",
+    "    second_pass_times.append(response_time)\n",
+    "    second_pass_hits.append(cache_hit)\n",
+    "    print(f\"{i}. {question}\")\n",
+    "    print(f\" Cache: {'HIT ✓' if cache_hit else 'MISS ✗'} | Time: {response_time:.3f}s\")\n",
+    "    print(f\" Answer: {answer[:100]}...\\n\")\n",
+    "\n",
+    "# Third pass - without cache (baseline)\n",
+    "print(\"\\n[THIRD PASS - Without Cache (Baseline)]\\n\")\n",
+    "no_cache_times = []\n",
+    "\n",
+    "for i, question in enumerate(test_questions_rag, 1):\n",
+    "    answer, _, response_time = rag_with_cache(question, use_cache=False)\n",
+    "    no_cache_times.append(response_time)\n",
+    "    print(f\"{i}. {question}\")\n",
+    "    print(f\" Cache: DISABLED | Time: {response_time:.3f}s\\n\")\n",
+    "\n",
+    "# Summary\n",
+    "print(\"\\n\" + \"=\"*80)\n",
+    "print(\"PERFORMANCE SUMMARY\")\n",
+    "print(\"=\"*80)\n",
+    "avg_first = sum(first_pass_times)/len(first_pass_times)\n",
+    "avg_second = sum(second_pass_times)/len(second_pass_times)\n",
+    "avg_no_cache = sum(no_cache_times)/len(no_cache_times)\n",
+    "\n",
+    "print(f\"Average time - First pass (original questions): {avg_first:.3f}s\")\n",
+    "print(f\"Average time - Second pass (paraphrased questions): {avg_second:.3f}s\")\n",
+    "print(f\"Average time - Without cache: {avg_no_cache:.3f}s\")\n",
+    "\n",
+    "# Compare cached responses against the no-cache LLM baseline\n",
+    "if avg_second > 0:\n",
+    "    speedup = avg_no_cache / avg_second\n",
+    "    print(f\"\\nSpeedup with cache vs. no cache: {speedup:.1f}x faster\")\n",
+    "\n",
+    "# Hit rate comes from the cache_hit flags returned by rag_with_cache\n",
+    "cache_hit_rate = sum(second_pass_hits) / len(similar_questions)\n",
+    "print(f\" Cache hit rate (second pass): {cache_hit_rate*100:.0f}%\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 8. Best Practices and Tips\n",
+    "\n",
+    "### Key Takeaways\n",
+    "\n",
+    "1. **Threshold Optimization**: Start conservative (0.10-0.15) and optimize based on real usage data\n",
+    "2. **Doc-to-Cache**: Pre-populate your cache with high-quality FAQs for immediate benefits\n",
+    "3. **Monitoring**: Track cache hit rates and adjust thresholds as user patterns emerge\n",
+    "4. **Model Selection**: The `langcache-embed-v1` model is specifically optimized for caching tasks\n",
+    "5. **Cost-Performance Balance**: Even a 50% cache hit rate provides significant cost savings\n",
+    "\n",
+    "### When to Use Semantic Caching\n",
+    "\n",
+    "✅ **Good Use Cases:**\n",
+    "- High-traffic applications with repeated question patterns\n",
+    "- Customer support chatbots\n",
+    "- FAQ systems\n",
+    "- Documentation Q&A\n",
+    "- Product information queries\n",
+    "- Educational content Q&A\n",
+    "\n",
+    "❌ **Less Suitable:**\n",
+    "- Highly dynamic content requiring real-time data\n",
+    "- Creative writing tasks needing variety\n",
+    "- Personalized responses based on user-specific context\n",
+    "- Time-sensitive queries (use TTL if needed)\n",
+    "\n",
+    "### Performance Tips\n",
+    "\n",
+    "1. **Batch Loading**: Pre-populate cache with Doc-to-Cache for immediate value\n",
+    "2. **Monitor Hit Rates**: Track and adjust thresholds based on production metrics (see the sketch below)\n",
+    "3. **A/B Testing**: Test different thresholds with a subset of traffic\n",
+    "4. **Cache Warming**: Regularly update cache with trending topics\n",
+    "5. **TTL Management**: Set time-to-live for entries that may become stale\n"
+   ]
+  },
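+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Optional: a hit-rate monitoring sketch\n",
+    "\n",
+    "As a quick illustration of the \"Monitor Hit Rates\" tip above, the sketch below re-checks the paraphrased questions against the cache and reports the hit rate at the chosen threshold. It is a minimal sketch that only reuses `cache.check()`, `similar_questions`, and `optimal_threshold` from earlier cells; the helper name `measure_hit_rate` is an illustrative choice, not part of the LangCache API.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def measure_hit_rate(probe_questions, threshold):\n",
+    "    \"\"\"Return the fraction of probe questions served from the cache at the given threshold.\"\"\"\n",
+    "    hits = 0\n",
+    "    for question in probe_questions:\n",
+    "        # cache.check returns an empty/falsy result when nothing is within the threshold\n",
+    "        if cache.check(prompt=question, distance_threshold=threshold):\n",
+    "            hits += 1\n",
+    "    return hits / len(probe_questions)\n",
+    "\n",
+    "# Example: re-check the paraphrased questions from the comparison above\n",
+    "hit_rate = measure_hit_rate(similar_questions, optimal_threshold)\n",
+    "print(f\"Cache hit rate at threshold {optimal_threshold}: {hit_rate*100:.0f}%\")"
+   ]
+  },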
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 9. Cleanup\n",
+    "\n",
+    "Clean up resources when done.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Clear cache contents\n",
+    "# cache.clear()\n",
+    "# print(\"Cache contents cleared\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Summary\n",
+    "\n",
+    "Congratulations! You've completed this comprehensive guide on semantic caching with LangCache and RedisVL.\n",
+    "\n",
+    "**What You've Learned:**\n",
+    "- ✅ Set up and configure LangCache with Redis Cloud\n",
+    "- ✅ Load and process PDF documents into knowledge bases\n",
+    "- ✅ Generate FAQs using the Doc-to-Cache technique with LLMs\n",
+    "- ✅ Pre-populate a semantic cache with tagged entries\n",
+    "- ✅ Test different cache matching strategies and thresholds\n",
+    "- ✅ Optimize cache performance using test datasets\n",
+    "- ✅ Leverage the `redis/langcache-embed-v1` embedding model\n",
+    "- ✅ Integrate semantic caching into RAG pipelines\n",
+    "- ✅ Measure performance improvements and cost savings\n",
+    "\n",
+    "**Next Steps:**\n",
+    "- Experiment with different distance thresholds for your use case\n",
+    "- Try other embedding models and compare performance\n",
+    "- Implement cache analytics and monitoring in production\n",
+    "- Explore advanced features like TTL, metadata filtering, and cache warming strategies\n",
+    "- Scale your semantic cache to handle production traffic\n",
+    "\n",
+    "**Resources:**\n",
+    "- [RedisVL Documentation](https://docs.redisvl.com/en/stable/index.html)\n",
+    "- [LangCache Sign Up](https://redis.io/langcache/)\n",
+    "- [Redis AI Resources](https://github.com/redis-developer/redis-ai-resources)\n",
+    "- [Semantic Caching Paper](https://arxiv.org/abs/2504.02268)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "redis-ai-res",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}