diff --git a/ai/generative-ai-service/reranker-rag-demo/README.md b/ai/generative-ai-service/reranker-rag-demo/README.md new file mode 100644 index 000000000..27bc401ed --- /dev/null +++ b/ai/generative-ai-service/reranker-rag-demo/README.md @@ -0,0 +1,326 @@ +# OCI Reranker RAG Demo - Vision Corp Leave Policy + +A lightweight demo that shows the practical difference between using only a vector store and adding OCI Reranker on top of retrieval. + +The app uses a small leave-policy knowledge base derived from `Vision Corp Leave policy.pdf`. A user asks a policy question, the backend retrieves candidate passages with OCI embeddings and FAISS, and the UI lets you switch between: + +- **Vector store only:** answer from the highest cosine-similarity document. +- **OCI Reranker on:** answer from the same retrieved candidates after OCI Reranker reorders them by query-document relevance. + +This makes the reranker effect visible: the retrieved candidate list can contain the right document, but plain vector similarity may not rank it first. The reranker can promote the better passage before the answer is shown. + +## Screenshots + +### Vector Store Only + +![Vector search only](files/screenshots/vector-search-only.png) + +### OCI Reranker Enabled + +![OCI Reranker enabled](files/screenshots/oci-reranker-enabled.png) + +### Side-by-Side Ranking Comparison + +![Ranking comparison](files/screenshots/ranking-comparison.png) + +## What This Demo Does + +- Loads a PDF-derived knowledge base from `files/knowledge_base.json`. +- Embeds every knowledge-base chunk with OCI Generative AI embeddings. +- Stores normalized vectors in an in-memory FAISS `IndexFlatIP` index. +- Embeds the user query with the same OCI embedding model. +- Retrieves the top-k candidates by cosine similarity. +- Optionally sends those same candidates to OCI Generative AI `RerankText`. +- Shows the answer, citations, vector ranking, reranked ranking, and score differences in the browser. 
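The normalize-then-inner-product steps listed above can be illustrated without OCI or FAISS. The following is a minimal sketch in plain Python, not the code in `server.py`: the helper names `l2_normalize`, `inner_product`, and `top_k` are hypothetical, and the 2-D vectors are toy stand-ins for real embeddings. It shows why the inner product of unit-length vectors is the cosine similarity that an `IndexFlatIP` search over normalized vectors would rank by.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query, documents, k):
    """Rank documents by the inner product of unit vectors (cosine similarity)."""
    unit_query = l2_normalize(query)
    scored = [
        (inner_product(unit_query, l2_normalize(doc)), index)
        for index, doc in enumerate(documents)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(index, score) for score, index in scored[:k]]


# Toy 2-D "embeddings" (assumed values); real embeddings have many more dimensions.
toy_documents = [[1.0, 0.0], [3.0, 3.0], [-1.0, 2.0]]
toy_query = [2.0, 2.0]
ranking = top_k(toy_query, toy_documents, k=2)
print(ranking)
```

Document 1 points in the same direction as the query, so it ranks first even though its raw magnitude differs. Normalization removes vector length from the comparison, which is what makes the score a cosine.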
+ +The demo intentionally uses compound policy questions, such as asking about public holidays and annual leave in the same query, so the value of reranking is easier to see. + +## Demo Questions + +The UI includes three preset questions: + +1. **Annual days:** What public holidays are listed, and how many annual leave days do Netherlands and Poland employees get? +2. **Sick certificate:** If leave is without manager approval it may be unpaid, but what if the absence is illness over two days? +3. **Compassionate:** Who counts as a direct relative, and how much compassionate leave is given for death of a spouse or direct relative? + +You can also type your own question in the text box. + +## Architecture + +```mermaid +flowchart LR + UI[Browser UI] --> API[Python HTTP backend] + KB[knowledge_base.json] --> API + API --> EmbedDocs[OCI EmbedText documents] + EmbedDocs --> FAISS[FAISS IndexFlatIP vector store] + UI --> Query[User question] + Query --> API + API --> EmbedQuery[OCI EmbedText query] + EmbedQuery --> FAISS + FAISS --> Candidates[Top-k vector candidates] + Candidates --> VectorAnswer[Vector-only answer] + Candidates --> Rerank[OCI RerankText] + Rerank --> RerankedAnswer[Reranked answer] + VectorAnswer --> UI + RerankedAnswer --> UI +``` + +## How Retrieval Works + +### 1. Knowledge Base + +The runtime knowledge base is persisted in: + +```text +files/knowledge_base.json +``` + +It currently contains 12 chunks from the Vision Corp leave policy, including annual leave, sick leave, maternity and paternity leave, compassionate leave, public holidays, manager approval, and direct-relative definitions. + +The app does not read the PDF at runtime. 
The PDF content has already been converted into structured JSON chunks with this shape: + +```json +{ + "id": "annual-leave", + "title": "Annual Leave", + "source": "Vision Corp Leave policy.pdf - Section 2", + "text": "...policy passage...", + "tags": ["annual leave", "Netherlands", "Poland"], + "answer": "...grounded answer used by the demo..." +} +``` + +### 2. Vector Store + +On the first query, `server.py`: + +1. Reads `knowledge_base.json`. +2. Sends each document passage to OCI `EmbedText` with `input_type=SEARCH_DOCUMENT`. +3. Normalizes the embedding matrix with `faiss.normalize_L2`. +4. Builds an in-memory FAISS `IndexFlatIP` index. + +Because the vectors are normalized, FAISS inner product is used as cosine similarity. + +The vector index is not written to disk. It is rebuilt in memory when the server starts or when the knowledge-base fingerprint changes. + +### 3. Query Search + +For each question, the backend: + +1. Sends the query to OCI `EmbedText` with `input_type=SEARCH_QUERY`. +2. Normalizes the query vector. +3. Searches FAISS for the top-k most similar chunks. +4. Returns the vector results and cosine scores to the UI. + +### 4. OCI Reranking + +When the switch is on, the backend sends the retrieved candidates to OCI Generative AI `RerankText`: + +```text +input: the user question +documents: top-k passages from vector search +top_n: number of candidates to return +model: cohere.rerank-v4.0-fast +region: me-riyadh-1 +``` + +OCI returns a `relevance_score` for each candidate. The app then reorders the same vector-retrieved candidates using that reranker score. + +### 5. Answer Display + +This demo keeps generation simple and transparent: it displays the curated `answer` field from the top-ranked document. + +That means: + +- In vector-only mode, the answer comes from the top vector result. +- In reranker mode, the answer comes from the top reranked result. + +No LLM chat generation is currently used after retrieval. 
This keeps the demo focused on showing the difference between vector retrieval and reranking. You can extend it later by sending the top reranked passages into an LLM prompt. + +## Scores Explained + +The UI shows two different score types: + +- **Cosine score:** Produced by FAISS from normalized OCI embeddings. This is used in vector-only retrieval. +- **Relevance score:** Returned by OCI Reranker. This is the reranker model's relevance score for a query-document pair. + +These scores are real, but they are not on the same scale. Compare cosine scores with cosine scores, and reranker relevance scores with reranker relevance scores. The important signal is how the candidate order changes. + +## Tech Stack + +- Frontend: HTML, CSS, vanilla JavaScript +- Backend: Python `http.server` with custom API handlers +- Embeddings: OCI Generative AI `cohere.embed-v4.0` +- Reranker: OCI Generative AI `cohere.rerank-v4.0-fast` +- Vector search: FAISS `IndexFlatIP` +- Knowledge base: local JSON file generated from the policy PDF + +## Project Structure + +```text +Reranker Demo/ +|-- README.md +`-- files/ + |-- app.js + |-- index.html + |-- knowledge_base.json + |-- screenshots/ + | |-- vector-search-only.png + | |-- oci-reranker-enabled.png + | `-- ranking-comparison.png + |-- server.py + `-- styles.css +``` + +## Setup + +### 1. Install Python Dependencies + +From the app folder: + +```powershell +cd files +python -m venv .venv +.\.venv\Scripts\Activate.ps1 +pip install oci numpy faiss-cpu +``` + +If you already have these packages installed globally, you can run the server without creating a virtual environment. + +### 2. Configure OCI Credentials + +The backend reads your OCI config from `~/.oci/config` by default and uses the `DEFAULT` profile unless overridden. + +Required OCI access: + +- A valid OCI config profile with API key authentication. +- A compartment with permission to call OCI Generative AI inference. +- Access to OCI Generative AI in the Riyadh region, `me-riyadh-1`. 
+ +Recommended PowerShell environment variables: + +```powershell +$env:OCI_CONFIG_PROFILE="DEFAULT" +$env:OCI_REGION="me-riyadh-1" +$env:OCI_COMPARTMENT_ID="" +$env:OCI_EMBED_MODEL_ID="cohere.embed-v4.0" +$env:OCI_RERANK_MODEL_ID="cohere.rerank-v4.0-fast" +``` + +Optional overrides: + +```powershell +$env:OCI_CONFIG_FILE="" +$env:OCI_GENAI_ENDPOINT="https://inference.generativeai.me-riyadh-1.oci.oraclecloud.com" +$env:OCI_EMBED_ENDPOINT_ID="" +$env:OCI_RERANK_ENDPOINT_ID="" +$env:OCI_EMBED_BATCH_SIZE="96" +$env:HOST="127.0.0.1" +$env:PORT="4173" +``` + +Do not commit your OCI private key, local OCI config, or secrets to GitHub. + +## Run Locally + +```powershell +cd files +python server.py +``` + +Then open: + +```text +http://127.0.0.1:4173/ +``` + +Health check: + +```powershell +Invoke-RestMethod -Uri http://127.0.0.1:4173/api/status +``` + +## API Endpoints + +### `GET /api/status` + +Returns OCI configuration status, selected region, embedding model, reranker model, and vector-store engine. + +### `POST /api/search` + +Runs vector retrieval, and optionally reranking. + +Example request: + +```json +{ + "query": "What public holidays are listed, and how many annual leave days do Netherlands and Poland employees get?", + "useReranker": true, + "topK": 12 +} +``` + +Example response fields: + +```json +{ + "answer": "...", + "answerMode": "reranker", + "vectorResults": [], + "rerankedResults": [], + "vector": {}, + "reranker": {}, + "timingsMs": {} +} +``` + +## Updating the Knowledge Base + +To change the demo content, edit: + +```text +files/knowledge_base.json +``` + +Each chunk should have a clear `title`, `source`, `text`, `tags`, and `answer`. The `text` field is what gets embedded and reranked. The `answer` field is what the demo displays when that chunk wins. + +After editing the JSON, refresh the browser or run another query. The backend fingerprints the knowledge base and rebuilds the in-memory FAISS index when the content changes. 
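One way a fingerprint-based rebuild like the one described above could work is sketched below. This is an illustration under stated assumptions, not the logic in `server.py`: the `fingerprint` function, the `IndexCache` class, and the stand-in `build_index` callable are all hypothetical names, and a SHA-256 digest of the canonical JSON is just one reasonable choice of fingerprint.

```python
import hashlib
import json


def fingerprint(chunks):
    """Return a stable SHA-256 digest of the knowledge-base content."""
    canonical = json.dumps(chunks, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IndexCache:
    """Hypothetical cache that rebuilds the index only when content changes."""

    def __init__(self, build_index):
        self._build_index = build_index  # callable: chunks -> index object
        self._fingerprint = None
        self._index = None
        self.rebuild_count = 0

    def get(self, chunks):
        current = fingerprint(chunks)
        if current != self._fingerprint:
            # Content changed (or first call): rebuild and remember the digest.
            self._index = self._build_index(chunks)
            self._fingerprint = current
            self.rebuild_count += 1
        return self._index


# Stand-in "index" that just collects the passage text (assumption for the demo).
kb = [{"id": "annual-leave", "text": "25 working days"}]
cache = IndexCache(build_index=lambda chunks: [c["text"] for c in chunks])
cache.get(kb)                       # first call builds the index
cache.get(kb)                       # unchanged content: no rebuild
kb[0]["text"] = "26 working days"   # simulate editing knowledge_base.json
index_after_edit = cache.get(kb)    # changed content: rebuild
print(cache.rebuild_count)
```

Hashing the parsed content rather than the file's modification time means a rewrite of the file with identical content does not trigger a re-embedding pass, which matters when every rebuild costs an OCI embedding call.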
+ +## Troubleshooting + +### `ModuleNotFoundError: No module named 'faiss'` + +Install FAISS for Python: + +```powershell +pip install faiss-cpu +``` + +### OCI returns `404 Authorization failed or requested resource not found` + +The code reached OCI, but the configured profile, compartment, model, or endpoint is not authorized or could not be found. Check: + +- `OCI_COMPARTMENT_ID` +- IAM policies for Generative AI inference +- Region availability, especially `me-riyadh-1` +- Model IDs or dedicated endpoint OCIDs + +### Port `4173` is already in use + +Use another port: + +```powershell +$env:PORT="4174" +python server.py +``` + +### Reranker and vector answers look the same + +That can happen when vector search already ranks the best passage first. Use the compound preset questions or add overlapping KB chunks to make the reranking effect easier to demonstrate. + +## Notes for GitHub + +Do not commit OCI config files, private keys, `.env` files, or personal OCIDs. This demo reads sensitive deployment values from environment variables or your local OCI config at runtime. + +The app is intentionally simple: no build step, no frontend framework, and no database. It is meant to be a clear demo asset for explaining how reranking can improve RAG retrieval quality. + diff --git a/ai/generative-ai-service/reranker-rag-demo/files/app.js b/ai/generative-ai-service/reranker-rag-demo/files/app.js new file mode 100644 index 000000000..1a68f35aa --- /dev/null +++ b/ai/generative-ai-service/reranker-rag-demo/files/app.js @@ -0,0 +1,349 @@ +const presets = { + meaning: "What public holidays are listed, and how many annual leave days do Netherlands and Poland employees get?", + useCases: "If leave is without manager approval it may be unpaid, but what if the absence is illness over two days?", + adoption: "Who counts as a direct relative, and how much compassionate leave is given for death of a spouse or direct relative?" 
+}; + +const RESULT_LIMIT = 12; + +const presetLabels = { + meaning: "Annual days", + useCases: "Sick certificate", + adoption: "Compassionate" +}; + +const ui = { + documentTitle: "Vision Corp Leave Policy RAG", + eyebrow: "Retrieval augmented generation", + title: "Vision Corp Leave Policy RAG", + queryLabel: "Question", + vectorMode: "Vector store only", + rerankMode: "OCI Reranker on", + toggleSr: "Use OCI Reranker", + steps: { + query: "Query", + vector: "Vector search", + rerank: "Rerank top-k", + answer: "Answer" + }, + answerEyebrow: "Generated response", + answerTitle: "RAG answer", + scoreEyebrow: "Ranking effect", + scoreTitle: "Top passages", + withoutReranker: "Without reranker - top 12", + withReranker: "With OCI Reranker - top 12", + vectorOnly: "Vector store", + callingOci: "Calling OCI", + ociReranker: "OCI Reranker", + ociFailed: "OCI call failed", + rankingStable: "Ranking stable", + startServer: "Start server.py", + retrieved: "retrieved", + sameRank: "same rank", + scoreNote: + "Vector score is cosine similarity from OCI embeddings in FAISS. Reranker score is OCI relevance_score. 
They are real scores, but different scales.", + up: (amount) => `up ${amount}`, + down: (amount) => `down ${amount}`, + movedUp: (title, amount) => `${title} moved up ${amount}`, + backendMessage: (message) => `Backend message: ${message}` +}; + +const queryInput = document.querySelector("#query-input"); +const rerankToggle = document.querySelector("#rerank-toggle"); +const rankList = document.querySelector("#rank-list"); +const answerCopy = document.querySelector("#answer-copy"); +const citationRow = document.querySelector("#citation-row"); +const answerMode = document.querySelector("#answer-mode"); +const liftBadge = document.querySelector("#lift-badge"); +const modelPill = document.querySelector("#model-pill"); +const modelPillText = document.querySelector("#model-pill-text"); +const vectorStack = document.querySelector("#vector-stack"); +const rerankStack = document.querySelector("#rerank-stack"); +const presetButtons = document.querySelectorAll(".preset-button"); +const scoreNote = document.querySelector("#score-note"); +const initialParams = new URLSearchParams(window.location.search); +const presetKeys = Object.keys(presets); + +let activePreset = presetKeys.includes(initialParams.get("preset")) + ? initialParams.get("preset") + : "meaning"; +let renderToken = 0; +let renderTimer; +let activeSearchController = null; + +function setText(selector, value) { + document.querySelector(selector).textContent = value; +} + +function formatScore(value) { + const score = Number(value); + if (!Number.isFinite(score)) { + return "--"; + } + return score.toFixed(3); +} + +function scoreValue(doc, mode) { + return mode === "reranker" ? doc.rerankScore : doc.vectorScore; +} + +function scoreLabel(mode) { + return mode === "reranker" ? 
"relevance" : "cosine"; +} + +function movementLabel(doc, index, mode) { + if (mode !== "reranker") { + return ui.retrieved; + } + + const movement = doc.vectorRank - (index + 1); + if (movement > 0) { + return ui.up(movement); + } + if (movement < 0) { + return ui.down(Math.abs(movement)); + } + return ui.sameRank; +} + +function movementClass(doc, index, mode) { + if (mode !== "reranker") { + return "movement"; + } + const movement = doc.vectorRank - (index + 1); + return movement < 0 ? "movement down" : "movement"; +} + +function renderCitations(results) { + citationRow.innerHTML = ""; + results.slice(0, 3).forEach((doc) => { + const citation = document.createElement("span"); + citation.className = "citation"; + citation.textContent = doc.source; + citationRow.append(citation); + }); +} + +function renderRankList(results, mode) { + rankList.innerHTML = ""; + + results.forEach((doc, index) => { + const card = document.createElement("article"); + card.className = `rank-card${index === 0 ? " is-top" : ""}`; + card.innerHTML = ` +
${index + 1}
+
+

${doc.title}

+

${doc.text}

+
${doc.tags.map((tag) => `${tag}`).join("")}
+
+
+ ${formatScore(scoreValue(doc, mode))} + ${scoreLabel(mode)} + ${movementLabel(doc, index, mode)} +
+ `; + rankList.append(card); + }); +} + +function renderMiniStack(target, results, mode) { + target.innerHTML = ""; + + if (!results.length) { + const item = document.createElement("div"); + item.className = "mini-empty"; + item.textContent = "Turn on OCI Reranker to compare the reordered candidates."; + target.append(item); + return; + } + + results.slice(0, RESULT_LIMIT).forEach((doc, index) => { + const item = document.createElement("div"); + item.className = "mini-item"; + item.innerHTML = ` +
${index + 1}
+
+ ${doc.title} + ${formatScore(scoreValue(doc, mode))} ${scoreLabel(mode)} +
+ `; + target.append(item); + }); +} + +function renderLoading(isReranking) { + document.body.classList.toggle("rerank-active", isReranking); + answerMode.textContent = isReranking ? ui.callingOci : ui.vectorOnly; + liftBadge.textContent = isReranking ? "Embedding search + rerank" : "Embedding search"; + answerCopy.textContent = isReranking + ? "Running vector search with OCI embeddings, then sending the retrieved candidates to OCI Reranker..." + : "Running vector search with OCI embeddings in the backend FAISS index..."; + citationRow.innerHTML = ""; + rankList.innerHTML = ""; + vectorStack.innerHTML = ""; + rerankStack.innerHTML = ""; + scoreNote.textContent = ui.scoreNote; +} + +function renderError(error) { + document.body.classList.remove("rerank-active"); + answerMode.textContent = ui.ociFailed; + liftBadge.textContent = "No fallback score"; + answerCopy.textContent = ui.backendMessage(error.message || "Search failed."); + citationRow.innerHTML = ""; + rankList.innerHTML = ""; + vectorStack.innerHTML = ""; + rerankStack.innerHTML = ""; +} + +function renderView(data) { + const isReranking = data.answerMode === "reranker"; + const activeResults = isReranking ? data.rerankedResults : data.vectorResults; + const promotedDoc = data.rerankedResults?.[0]; + + document.body.classList.toggle("rerank-active", isReranking); + answerMode.textContent = isReranking ? ui.ociReranker : ui.vectorOnly; + + if (isReranking && promotedDoc) { + const lift = promotedDoc.vectorRank - promotedDoc.rerankRank; + liftBadge.textContent = lift > 0 ? ui.movedUp(promotedDoc.title, lift) : ui.rankingStable; + } else { + liftBadge.textContent = `${data.vector.engine} - ${data.vector.metric}`; + } + + answerCopy.textContent = data.answer; + scoreNote.textContent = ui.scoreNote; + + renderCitations(activeResults); + renderRankList(activeResults, isReranking ? 
"reranker" : "vector"); + renderMiniStack(vectorStack, data.vectorResults || [], "vector"); + renderMiniStack(rerankStack, data.rerankedResults || [], "reranker"); +} + +async function requestSearch(query, useReranker, signal) { + const response = await fetch("/api/search", { + method: "POST", + signal, + headers: { + "Content-Type": "application/json" + }, + body: JSON.stringify({ + query, + useReranker, + topK: RESULT_LIMIT + }) + }); + const data = await response.json(); + if (!response.ok || !data.ok) { + throw new Error(data.message || "Search request failed."); + } + return data; +} + +async function fetchOciStatus() { + try { + const response = await fetch("/api/status"); + const data = await response.json(); + if (!response.ok || !data.ok) { + throw new Error(data.message || "OCI backend is not ready."); + } + modelPillText.textContent = `${data.embeddingModel} + ${data.rerankModel}`; + } catch (error) { + modelPillText.textContent = ui.startServer; + } +} + +async function render() { + const token = ++renderToken; + const query = queryInput.value.trim() || presets.meaning; + const isReranking = rerankToggle.checked; + + if (activeSearchController) { + activeSearchController.abort(); + } + + const controller = new AbortController(); + activeSearchController = controller; + renderLoading(isReranking); + + try { + const data = await requestSearch(query, isReranking, controller.signal); + if (token !== renderToken) { + return; + } + activeSearchController = null; + renderView(data); + } catch (error) { + if (error.name === "AbortError" || token !== renderToken) { + return; + } + activeSearchController = null; + renderError(error); + } +} + +function scheduleRender() { + window.clearTimeout(renderTimer); + renderTimer = window.setTimeout(render, 350); +} + +function initializeText() { + document.documentElement.lang = "en"; + document.documentElement.dir = "ltr"; + document.title = ui.documentTitle; + setText("#page-eyebrow", ui.eyebrow); + 
setText("#page-title", ui.title); + setText("#query-label", ui.queryLabel); + setText("#vector-mode-label", ui.vectorMode); + setText("#rerank-mode-label", ui.rerankMode); + setText("#toggle-sr", ui.toggleSr); + setText("#step-query", ui.steps.query); + setText("#step-vector", ui.steps.vector); + setText("#step-rerank", ui.steps.rerank); + setText("#step-answer", ui.steps.answer); + setText("#answer-eyebrow", ui.answerEyebrow); + setText("#answer-title", ui.answerTitle); + setText("#score-eyebrow", ui.scoreEyebrow); + setText("#score-title", ui.scoreTitle); + setText("#vector-mini-title", ui.withoutReranker); + setText("#rerank-mini-title", ui.withReranker); + + modelPill.setAttribute("aria-label", "Selected OCI models"); + document.querySelector("#control-strip").setAttribute("aria-label", "Demo controls"); + document.querySelector("#preset-row").setAttribute("aria-label", "Example questions"); + document.querySelector("#pipeline").setAttribute("aria-label", "RAG pipeline"); + citationRow.setAttribute("aria-label", "Cited passages"); + document.querySelector(".compare-grid").setAttribute("aria-label", "Side by side comparison"); + + presetButtons.forEach((button) => { + button.textContent = presetLabels[button.dataset.preset]; + button.classList.toggle("is-active", button.dataset.preset === activePreset); + }); + + queryInput.value = presets[activePreset]; + scoreNote.textContent = ui.scoreNote; +} + +queryInput.addEventListener("input", () => { + activePreset = null; + presetButtons.forEach((item) => item.classList.remove("is-active")); + scheduleRender(); +}); + +rerankToggle.addEventListener("change", render); + +presetButtons.forEach((button) => { + button.addEventListener("click", () => { + activePreset = button.dataset.preset; + presetButtons.forEach((item) => item.classList.remove("is-active")); + button.classList.add("is-active"); + queryInput.value = presets[activePreset]; + render(); + }); +}); + +rerankToggle.checked = initialParams.get("rerank") !== 
"0"; +initializeText(); +fetchOciStatus(); +render(); diff --git a/ai/generative-ai-service/reranker-rag-demo/files/index.html b/ai/generative-ai-service/reranker-rag-demo/files/index.html new file mode 100644 index 000000000..7914b085c --- /dev/null +++ b/ai/generative-ai-service/reranker-rag-demo/files/index.html @@ -0,0 +1,119 @@ + + + + + + Vision Corp Leave Policy RAG + + + +
+
+
+
+

Retrieval augmented generation

+

Vision Corp Leave Policy RAG

+
+
+
+ + cohere.rerank-v4.0-fast +
+
+
+ +
+ + +
+ + + +
+ +
+ Vector search only + + OCI Reranker on +
+
+ +
+
+ Q + Query +
+ +
+ V + Vector search +
+ +
+ R + Rerank top-k +
+ +
+ A + Answer +
+
+ +
+
+
+
+

Generated response

+

RAG answer

+
+ +
+

+
+
+ +
+
+
+

Ranking effect

+

Top passages

+
+ +
+
+

+
+
+ +
+
+
Without reranker
+
+
+
+
With OCI Reranker
+
+
+
+
+
+ + + + diff --git a/ai/generative-ai-service/reranker-rag-demo/files/knowledge_base.json b/ai/generative-ai-service/reranker-rag-demo/files/knowledge_base.json new file mode 100644 index 000000000..98075c434 --- /dev/null +++ b/ai/generative-ai-service/reranker-rag-demo/files/knowledge_base.json @@ -0,0 +1,98 @@ +[ + { + "id": "eligibility-definitions", + "title": "Eligibility and Definitions", + "source": "Vision Corp Leave policy.pdf - Section 1", + "text": "Vision Corp. Leave Policy applies to all permanent employees of Vision Corp. based in Netherlands and Poland. The Leave Year coincides with the Vision Corp. Financial Year, from 1 January to 31 December. Direct Relative means Parents, Grandparents, Siblings, Children, and Grandchildren for Compassionate Leave. For all other leaves, Direct Relative means Parents, Spouse, Siblings, or Children. Manager refers to the employee's direct or indirect line manager. Leave calculated based on calendar days includes any weekends or public holidays falling within the leave period. All leave types, except special religious leave, are an annual entitlement per Leave Year.", + "tags": ["eligibility", "definitions", "leave year", "direct relative", "Netherlands", "Poland"], + "answer": "The policy applies to permanent Vision Corp employees in the Netherlands and Poland. The leave year runs from 1 January to 31 December, and direct-relative definitions differ for compassionate leave versus other leave types." + }, + { + "id": "direct-relative-definition", + "title": "Direct Relative Definition for Compassionate Leave", + "source": "Vision Corp Leave policy.pdf - Section 1.3", + "text": "Direct Relative is defined as follows. For Compassionate Leave: Employee's Parents, Grandparents, Siblings, Children, and Grandchildren. 
For all other leaves: Employee's Parents, Spouse, Siblings, or Children.", + "tags": ["direct relative", "compassionate leave", "parents", "grandparents", "siblings", "children", "grandchildren", "spouse"], + "answer": "For compassionate leave, direct relatives are the employee's parents, grandparents, siblings, children, and grandchildren. For all other leave types, direct relatives are parents, spouse, siblings, or children." + }, + { + "id": "annual-leave", + "title": "Annual Leave", + "source": "Vision Corp Leave policy.pdf - Section 2", + "text": "Netherlands employees are entitled to 25 working days of paid annual leave per Leave Year. Poland employees are entitled to 20 working days of paid annual leave per Leave Year, or 26 days if the employee has over 10 years of work experience. Employees hired during the Leave Year are eligible for annual leave on a pro-rata basis. Employees must use their annual leave within the Leave Year; otherwise, it will be forfeited unless otherwise agreed by HR.", + "tags": ["annual leave", "paid leave", "Netherlands", "Poland", "pro-rata", "carry over", "forfeit"], + "answer": "Netherlands employees receive 25 working days of paid annual leave. Poland employees receive 20 working days, or 26 days with over 10 years of work experience. New hires receive leave pro-rata, and unused annual leave is forfeited unless HR agrees otherwise." + }, + { + "id": "leave-year-calendar-days", + "title": "Leave Year and Calendar-Day Rules", + "source": "Vision Corp Leave policy.pdf - Sections 1.2, 1.5-1.6", + "text": "The Leave Year coincides with the Vision Corp. Financial Year, from 1 January to 31 December. Leave calculated based on calendar days includes any weekends or public holidays falling within the period of the leave availed. 
All leave types, except special religious leave, are an annual entitlement per Leave Year.", + "tags": ["leave year", "calendar days", "weekends", "public holidays", "annual entitlement", "financial year"], + "answer": "The leave year runs from 1 January to 31 December. When leave is calculated in calendar days, weekends and public holidays inside the leave period count as part of the leave. All leave types except special religious leave are annual entitlements per leave year." + }, + { + "id": "sick-leave", + "title": "Sick Leave", + "source": "Vision Corp Leave policy.pdf - Section 3", + "text": "Employees are eligible for paid sick leave. In the Netherlands, employees receive up to 104 weeks of sick leave with full or partial pay, subject to Dutch labor laws and company policies. In Poland, employees receive 33 days of paid sick leave per year at 80% of salary, after which the benefit is covered by social security, ZUS. Employees must inform their Manager on the first day of illness. Sick leave exceeding two consecutive days must be accompanied by a medical certificate.", + "tags": ["sick leave", "medical certificate", "illness", "Netherlands", "Poland", "ZUS", "manager"], + "answer": "In the Netherlands, sick leave can run up to 104 weeks with full or partial pay under Dutch law and company policy. In Poland, employees receive 33 days at 80% salary before ZUS covers the benefit. Employees must inform their manager on the first day of illness, and sick leave over two consecutive days requires a medical certificate." + }, + { + "id": "sick-leave-notice-certificate", + "title": "Sick Leave Notice and Medical Certificate", + "source": "Vision Corp Leave policy.pdf - Sections 3.2-3.3", + "text": "Employees must inform their Manager on the first day of illness. 
Sick leave exceeding two consecutive days must be accompanied by a medical certificate.", + "tags": ["sick leave", "illness", "manager", "medical certificate", "two consecutive days"], + "answer": "For illness, employees must inform their manager on the first day. If sick leave exceeds two consecutive days, they must provide a medical certificate." + }, + { + "id": "maternity-paternity", + "title": "Maternity and Paternity Leave", + "source": "Vision Corp Leave policy.pdf - Sections 4-5", + "text": "For maternity leave, Netherlands female employees are entitled to 16 weeks of fully paid maternity leave. Poland female employees are entitled to 20 weeks of paid maternity leave, with an option for additional leave. Employees must notify HR and their Manager at least 3 months before the expected delivery date. For paternity leave, Netherlands employees are entitled to 1 week of paid paternity leave plus an additional 5 weeks at 70% pay. Poland employees are entitled to 2 weeks of fully paid paternity leave. Paternity leave must be taken within 6 months of the child's birth or adoption.", + "tags": ["maternity leave", "paternity leave", "birth", "adoption", "Netherlands", "Poland", "HR notification"], + "answer": "Netherlands maternity leave is 16 fully paid weeks, while Poland maternity leave is 20 paid weeks with an option for additional leave. Employees must notify HR and their manager at least 3 months before delivery. For paternity leave, Netherlands provides 1 paid week plus 5 weeks at 70% pay, while Poland provides 2 fully paid weeks, taken within 6 months of birth or adoption." + }, + { + "id": "marriage-compassionate", + "title": "Marriage and Compassionate Leave", + "source": "Vision Corp Leave policy.pdf - Sections 6-7", + "text": "Employees are entitled to 3 working days of paid leave for their own wedding. Employees receive 1 day of paid leave for the wedding of a direct relative. 
For compassionate leave, employees receive 5 working days of paid leave for the death of a spouse. Employees receive 3 working days of paid leave for the death of a direct relative.", + "tags": ["marriage leave", "wedding", "compassionate leave", "death", "spouse", "direct relative"], + "answer": "For compassionate leave, direct relatives are parents, grandparents, siblings, children, and grandchildren. Employees receive 5 paid working days for the death of a spouse and 3 paid working days for the death of a direct relative. The policy also provides 3 paid working days for the employee's own wedding and 1 paid day for a direct relative's wedding." + }, + { + "id": "religious-unpaid-study", + "title": "Religious, Unpaid, and Study Leave", + "source": "Vision Corp Leave policy.pdf - Sections 8-10", + "text": "Employees who wish to observe special religious events may request up to 30 days of unpaid leave during their tenure at Vision Corp. Any leave availed without Manager approval will be considered unpaid leave and may lead to disciplinary action. Employees can apply for planned unpaid leave if they have exhausted their annual leave balance. Employees who have completed 2 years of service are eligible for 10 working days of study leave per year, only if enrolled in an accredited educational institution.", + "tags": ["religious leave", "unpaid leave", "study leave", "manager approval", "disciplinary action", "accredited education"], + "answer": "Special religious leave allows up to 30 unpaid days during an employee's tenure. Unapproved leave is treated as unpaid and may lead to disciplinary action, while planned unpaid leave can be requested after annual leave is exhausted. Study leave is 10 working days per year after 2 years of service, if the employee is enrolled in an accredited institution." 
+ }, + { + "id": "unpaid-leave-manager-approval", + "title": "Unpaid Leave and Manager Approval", + "source": "Vision Corp Leave policy.pdf - Section 9", + "text": "Any leave availed without Manager approval will be considered Unpaid Leave and may lead to disciplinary action. Employees can apply for planned unpaid leave if they have exhausted their Annual Leave balance.", + "tags": ["unpaid leave", "manager approval", "disciplinary action", "annual leave balance"], + "answer": "Leave taken without manager approval is considered unpaid leave and may lead to disciplinary action. Planned unpaid leave can be requested after annual leave is exhausted." + }, + { + "id": "public-holidays-approval", + "title": "Public Holidays and Leave Approval", + "source": "Vision Corp Leave policy.pdf - Sections 11-12", + "text": "Employees are entitled to all declared public holidays in their respective country. Netherlands public holidays include New Year's Day, Easter, King's Day, Liberation Day, Ascension Day, Christmas, and others. Poland public holidays include New Year's Day, Easter, Constitution Day, All Saints' Day, Christmas, and others. Public holidays will be communicated by HR annually. Employees must schedule leave in advance and secure approval from their Manager. Employees must submit all leave requests via the HR Self-Service Portal. Unused annual leave may not be carried over unless HR grants an exception. The policy is subject to updates based on local labor laws and Vision Corp internal guidelines.", + "tags": ["public holidays", "leave approval", "HR portal", "manager approval", "carry over", "Netherlands", "Poland"], + "answer": "Employees receive declared public holidays for their country, communicated annually by HR. Leave must be scheduled in advance, approved by the manager, and submitted through the HR Self-Service Portal. Unused annual leave cannot be carried over unless HR grants an exception." 
+ }, + { + "id": "public-holiday-examples", + "title": "Public Holiday Examples by Country", + "source": "Vision Corp Leave policy.pdf - Section 11", + "text": "Employees are entitled to all declared Public Holidays as per their respective country. Netherlands public holidays include New Year's Day, Easter, King's Day, Liberation Day, Ascension Day, Christmas, and others. Poland public holidays include New Year's Day, Easter, Constitution Day, All Saints' Day, Christmas, and others. Public holidays will be communicated by HR annually.", + "tags": ["public holidays", "Netherlands", "Poland", "Easter", "Christmas", "King's Day", "Constitution Day"], + "answer": "Employees receive declared public holidays in their country. Netherlands examples include New Year's Day, Easter, King's Day, Liberation Day, Ascension Day, and Christmas. Poland examples include New Year's Day, Easter, Constitution Day, All Saints' Day, and Christmas. HR communicates public holidays annually." + } +] diff --git a/ai/generative-ai-service/reranker-rag-demo/files/screenshots/oci-reranker-enabled.png b/ai/generative-ai-service/reranker-rag-demo/files/screenshots/oci-reranker-enabled.png new file mode 100644 index 000000000..1b2d3c627 Binary files /dev/null and b/ai/generative-ai-service/reranker-rag-demo/files/screenshots/oci-reranker-enabled.png differ diff --git a/ai/generative-ai-service/reranker-rag-demo/files/screenshots/ranking-comparison.png b/ai/generative-ai-service/reranker-rag-demo/files/screenshots/ranking-comparison.png new file mode 100644 index 000000000..e39be251d Binary files /dev/null and b/ai/generative-ai-service/reranker-rag-demo/files/screenshots/ranking-comparison.png differ diff --git a/ai/generative-ai-service/reranker-rag-demo/files/screenshots/vector-search-only.png b/ai/generative-ai-service/reranker-rag-demo/files/screenshots/vector-search-only.png new file mode 100644 index 000000000..81f847f40 Binary files /dev/null and 
b/ai/generative-ai-service/reranker-rag-demo/files/screenshots/vector-search-only.png differ diff --git a/ai/generative-ai-service/reranker-rag-demo/files/server.py b/ai/generative-ai-service/reranker-rag-demo/files/server.py new file mode 100644 index 000000000..c22803555 --- /dev/null +++ b/ai/generative-ai-service/reranker-rag-demo/files/server.py @@ -0,0 +1,456 @@ +import hashlib +import json +import mimetypes +import os +import time +from http import HTTPStatus +from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer +from pathlib import Path +from urllib.parse import urlparse + +import faiss +import numpy as np +import oci +from oci.generative_ai_inference import GenerativeAiInferenceClient +from oci.generative_ai_inference import models + + +ROOT = Path(__file__).resolve().parent +KB_PATH = ROOT / "knowledge_base.json" + +DEFAULT_EMBED_MODEL_ID = "cohere.embed-v4.0" +DEFAULT_RERANK_MODEL_ID = "cohere.rerank-v4.0-fast" +DEFAULT_REGION = "me-riyadh-1" +DEFAULT_TOP_K = 5 + +VECTOR_STORE_CACHE = {} + + +def get_env(name, default=None): + value = os.getenv(name) + return value if value not in (None, "") else default + + +def make_serving_mode(endpoint_env, model_env, default_model): + endpoint_id = get_env(endpoint_env) + model_id = get_env(model_env, default_model) + + if endpoint_id: + return { + "mode": models.DedicatedServingMode(endpoint_id=endpoint_id), + "label": endpoint_id, + "serving_mode_label": "DEDICATED", + } + + return { + "mode": models.OnDemandServingMode(model_id=model_id), + "label": model_id, + "serving_mode_label": "ON_DEMAND", + } + + +def load_oci_settings(): + profile = get_env("OCI_CONFIG_PROFILE", "DEFAULT") + config_path = get_env("OCI_CONFIG_FILE", "~/.oci/config") + config = oci.config.from_file(config_path, profile) + + region = get_env("OCI_REGION", DEFAULT_REGION) + config["region"] = region + configured_compartment = config.get("compartment_id") + compartment_id = get_env("OCI_COMPARTMENT_ID", configured_compartment) 
+ if not compartment_id: + raise ValueError("Set OCI_COMPARTMENT_ID or include compartment_id in your OCI config.") + + service_endpoint = get_env( + "OCI_GENAI_ENDPOINT", + f"https://inference.generativeai.{region}.oci.oraclecloud.com", + ) + client_kwargs = {} + if service_endpoint: + client_kwargs["service_endpoint"] = service_endpoint + + embed = make_serving_mode( + "OCI_EMBED_ENDPOINT_ID", + "OCI_EMBED_MODEL_ID", + DEFAULT_EMBED_MODEL_ID, + ) + rerank = make_serving_mode( + "OCI_RERANK_ENDPOINT_ID", + "OCI_RERANK_MODEL_ID", + DEFAULT_RERANK_MODEL_ID, + ) + + return { + "client": GenerativeAiInferenceClient(config, **client_kwargs), + "compartment_id": compartment_id, + "profile": profile, + "region": region, + "service_endpoint": service_endpoint, + "embed_serving_mode": embed["mode"], + "embed_serving_mode_label": embed["serving_mode_label"], + "embed_model_label": embed["label"], + "rerank_serving_mode": rerank["mode"], + "rerank_serving_mode_label": rerank["serving_mode_label"], + "rerank_model_label": rerank["label"], + "compartment_source": ( + "OCI_COMPARTMENT_ID" + if os.getenv("OCI_COMPARTMENT_ID") + else "config compartment_id" + ), + } + + +def load_knowledge_base(): + with KB_PATH.open("r", encoding="utf-8") as file: + docs = json.load(file) + + if not isinstance(docs, list) or not docs: + raise ValueError("knowledge_base.json must contain at least one document.") + return docs + + +def extract_document_text(document): + if isinstance(document, str): + return document + + title = document.get("title", "") + text = document.get("text", "") + tags = ", ".join(document.get("tags", [])) + source = document.get("source", "") + return f"Title: {title}\nSource: {source}\nTags: {tags}\nText: {text}".strip() + + +def public_document(document): + return { + "id": document.get("id"), + "title": document.get("title"), + "source": document.get("source"), + "text": document.get("text"), + "tags": document.get("tags", []), + "answer": document.get("answer", ""), 
+ } + + +def normalize_rank_index(index, total): + if isinstance(index, int) and 0 <= index < total: + return index + if isinstance(index, int) and 1 <= index <= total: + return index - 1 + return None + + +def normalize_matrix(vectors): + matrix = np.asarray(vectors, dtype=np.float32) + if matrix.ndim != 2 or matrix.shape[0] == 0: + raise ValueError("Embedding response did not contain a 2D vector array.") + faiss.normalize_L2(matrix) + return matrix + + +def extract_embeddings(data): + if getattr(data, "embeddings", None): + return data.embeddings + + by_type = getattr(data, "embeddings_by_type", None) + if isinstance(by_type, dict): + for key in ("float", "FLOAT", models.EmbedTextDetails.EMBEDDING_TYPES_FLOAT): + if key in by_type: + return by_type[key] + + raise ValueError("OCI embedding response did not include float embeddings.") + + +def embed_texts(settings, texts, input_type): + if not texts: + return [], None, None + + batch_size = int(get_env("OCI_EMBED_BATCH_SIZE", "96")) + embeddings = [] + model_id = None + model_version = None + + for start in range(0, len(texts), batch_size): + batch = texts[start : start + batch_size] + details = models.EmbedTextDetails( + inputs=batch, + compartment_id=settings["compartment_id"], + serving_mode=settings["embed_serving_mode"], + truncate=models.EmbedTextDetails.TRUNCATE_END, + input_type=input_type, + embedding_types=[models.EmbedTextDetails.EMBEDDING_TYPES_FLOAT], + ) + response = settings["client"].embed_text(details) + embeddings.extend(extract_embeddings(response.data)) + model_id = response.data.model_id or model_id + model_version = response.data.model_version or model_version + + return embeddings, model_id, model_version + + +def vector_store_fingerprint(docs, settings): + payload = { + "docs": docs, + "embed_model": settings["embed_model_label"], + "region": settings["region"], + "endpoint": settings["service_endpoint"], + } + encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8") 
+ return hashlib.sha256(encoded).hexdigest() + + +def get_vector_store(settings): + docs = load_knowledge_base() + fingerprint = vector_store_fingerprint(docs, settings) + cached = VECTOR_STORE_CACHE.get("store") + if cached and cached["fingerprint"] == fingerprint: + return cached + + document_texts = [extract_document_text(doc) for doc in docs] + embeddings, model_id, model_version = embed_texts( + settings, + document_texts, + models.EmbedTextDetails.INPUT_TYPE_SEARCH_DOCUMENT, + ) + vectors = normalize_matrix(embeddings) + index = faiss.IndexFlatIP(vectors.shape[1]) + index.add(vectors) + + store = { + "fingerprint": fingerprint, + "docs": docs, + "index": index, + "dimension": int(vectors.shape[1]), + "embedding_model": model_id or settings["embed_model_label"], + "embedding_model_version": model_version, + "metric": "cosine similarity", + "engine": "FAISS IndexFlatIP", + } + VECTOR_STORE_CACHE["store"] = store + return store + + +def vector_search(settings, query, top_k): + store = get_vector_store(settings) + query_embeddings, model_id, model_version = embed_texts( + settings, + [query], + models.EmbedTextDetails.INPUT_TYPE_SEARCH_QUERY, + ) + query_vector = normalize_matrix(query_embeddings) + limit = min(top_k, len(store["docs"])) + scores, indexes = store["index"].search(query_vector, limit) + + results = [] + for rank, (score, index) in enumerate(zip(scores[0], indexes[0]), start=1): + if index < 0: + continue + doc = public_document(store["docs"][int(index)]) + doc["vectorRank"] = rank + doc["vectorScore"] = float(score) + results.append(doc) + + vector_meta = { + "engine": store["engine"], + "metric": store["metric"], + "embeddingModel": model_id or store["embedding_model"], + "embeddingModelVersion": model_version or store["embedding_model_version"], + "dimension": store["dimension"], + "topK": limit, + } + return results, vector_meta + + +def rerank_results(settings, query, vector_results): + documents = [extract_document_text(doc) for doc in 
vector_results] + details = models.RerankTextDetails( + input=query, + compartment_id=settings["compartment_id"], + serving_mode=settings["rerank_serving_mode"], + documents=documents, + top_n=len(documents), + is_echo=False, + ) + response = settings["client"].rerank_text(details) + + ranked = [] + used = set() + for rank in response.data.document_ranks: + normalized_index = normalize_rank_index(rank.index, len(vector_results)) + if normalized_index is None or normalized_index in used: + continue + used.add(normalized_index) + doc = dict(vector_results[normalized_index]) + doc["rerankScore"] = float(rank.relevance_score) + ranked.append(doc) + + for index, doc in enumerate(vector_results): + if index in used: + continue + doc = dict(doc) + doc["rerankScore"] = None + ranked.append(doc) + + for index, doc in enumerate(ranked, start=1): + doc["rerankRank"] = index + + return ranked, { + "provider": "OCI Generative AI RerankText", + "model": response.data.model_id or settings["rerank_model_label"], + "modelVersion": response.data.model_version, + "servingMode": settings["rerank_serving_mode_label"], + "scoreName": "relevance score", + } + + +def make_error_payload(error): + if isinstance(error, oci.exceptions.ServiceError): + return { + "error": "ServiceError", + "status": error.status, + "code": error.code, + "message": error.message, + "opcRequestId": error.request_id, + } + + return { + "error": error.__class__.__name__, + "message": str(error), + } + + +class Handler(SimpleHTTPRequestHandler): + server_version = "OCIRerankerDemo/2.0" + + def translate_path(self, path): + parsed = urlparse(path) + requested = parsed.path.lstrip("/") or "index.html" + safe_parts = [part for part in requested.split("/") if part not in ("", ".", "..")] + return str(ROOT.joinpath(*safe_parts)) + + def log_message(self, format, *args): + return + + def send_json(self, status, payload): + body = json.dumps(payload, indent=2).encode("utf-8") + self.send_response(status) + 
self.send_header("Content-Type", "application/json; charset=utf-8") + self.send_header("Cache-Control", "no-store") + self.send_header("Content-Length", str(len(body))) + self.end_headers() + self.wfile.write(body) + + def do_GET(self): + parsed = urlparse(self.path) + if parsed.path == "/api/status": + try: + settings = load_oci_settings() + self.send_json( + HTTPStatus.OK, + { + "ok": True, + "profile": settings["profile"], + "region": settings["region"], + "endpoint": settings["service_endpoint"], + "vectorStore": "OCI embeddings + FAISS", + "embeddingModel": settings["embed_model_label"], + "rerankModel": settings["rerank_model_label"], + "embeddingServingMode": settings["embed_serving_mode_label"], + "rerankServingMode": settings["rerank_serving_mode_label"], + "compartmentSource": settings["compartment_source"], + }, + ) + except Exception as error: + self.send_json(HTTPStatus.INTERNAL_SERVER_ERROR, {"ok": False, **make_error_payload(error)}) + return + + path = Path(self.translate_path(self.path)) + if path.is_dir(): + path = path / "index.html" + if not path.exists() or not path.is_file(): + self.send_error(HTTPStatus.NOT_FOUND) + return + + content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream" + data = path.read_bytes() + self.send_response(HTTPStatus.OK) + self.send_header("Content-Type", content_type) + self.send_header("Cache-Control", "no-store") + self.send_header("Content-Length", str(len(data))) + self.end_headers() + self.wfile.write(data) + + def do_POST(self): + parsed = urlparse(self.path) + if parsed.path not in ("/api/search", "/api/rerank"): + self.send_error(HTTPStatus.NOT_FOUND) + return + + try: + content_length = int(self.headers.get("Content-Length", "0")) + payload = json.loads(self.rfile.read(content_length) or b"{}") + query = payload.get("query", "").strip() + top_k = int(payload.get("topK") or payload.get("topN") or DEFAULT_TOP_K) + use_reranker = bool(payload.get("useReranker", parsed.path == 
"/api/rerank")) + + if not query: + self.send_json(HTTPStatus.BAD_REQUEST, {"ok": False, "error": "BadRequest", "message": "query is required"}) + return + + top_k = max(1, min(top_k, 20)) + started = time.perf_counter() + settings = load_oci_settings() + vector_results, vector_meta = vector_search(settings, query, top_k) + vector_finished = time.perf_counter() + + reranked_results = [] + reranker_meta = { + "enabled": False, + "model": settings["rerank_model_label"], + "servingMode": settings["rerank_serving_mode_label"], + "scoreName": "relevance score", + } + if use_reranker: + reranked_results, reranker_meta = rerank_results(settings, query, vector_results) + reranker_meta["enabled"] = True + + active_results = reranked_results if use_reranker and reranked_results else vector_results + answer_doc = active_results[0] if active_results else {} + finished = time.perf_counter() + + self.send_json( + HTTPStatus.OK, + { + "ok": True, + "query": query, + "answer": answer_doc.get("answer", ""), + "answerSourceId": answer_doc.get("id"), + "answerMode": "reranker" if use_reranker else "vector", + "vector": vector_meta, + "reranker": reranker_meta, + "vectorResults": vector_results, + "rerankedResults": reranked_results, + "timingsMs": { + "vector": round((vector_finished - started) * 1000), + "reranker": round((finished - vector_finished) * 1000), + "total": round((finished - started) * 1000), + }, + }, + ) + except Exception as error: + self.send_json(HTTPStatus.INTERNAL_SERVER_ERROR, {"ok": False, **make_error_payload(error)}) + + +def main(): + host = get_env("HOST", "127.0.0.1") + port = int(get_env("PORT", "4173")) + server = ThreadingHTTPServer((host, port), Handler) + print(f"OCI Reranker demo running at http://{host}:{port}") + print("Using OCI config from", get_env("OCI_CONFIG_FILE", "~/.oci/config")) + print("Using OCI region", get_env("OCI_REGION", DEFAULT_REGION)) + print("Using OCI embedding model", get_env("OCI_EMBED_MODEL_ID", DEFAULT_EMBED_MODEL_ID)) + 
print("Using OCI reranker model", get_env("OCI_RERANK_MODEL_ID", DEFAULT_RERANK_MODEL_ID)) + server.serve_forever() + + +if __name__ == "__main__": + main() diff --git a/ai/generative-ai-service/reranker-rag-demo/files/styles.css b/ai/generative-ai-service/reranker-rag-demo/files/styles.css new file mode 100644 index 000000000..72e722aaf --- /dev/null +++ b/ai/generative-ai-service/reranker-rag-demo/files/styles.css @@ -0,0 +1,643 @@ +:root { + --bg: #f6f4f1; + --ink: #201f1d; + --muted: #6b6761; + --panel: #fffdfa; + --line: #ddd7ce; + --oracle: #c74634; + --oracle-dark: #8e2f23; + --teal: #0f766e; + --teal-soft: #ddf4f0; + --gold: #ad7a12; + --blue: #27548a; + --shadow: 0 18px 45px rgba(55, 48, 39, 0.12); + font-family: + Inter, ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", + sans-serif; +} + +* { + box-sizing: border-box; +} + +body { + min-height: 100vh; + margin: 0; + background: + linear-gradient(120deg, rgba(199, 70, 52, 0.08), transparent 34%), + linear-gradient(240deg, rgba(15, 118, 110, 0.1), transparent 38%), + var(--bg); + color: var(--ink); +} + +button, +textarea { + font: inherit; +} + +button { + cursor: pointer; +} + +.app-shell { + min-height: 100vh; + padding: 28px; +} + +.workspace { + width: min(1180px, 100%); + margin: 0 auto; +} + +.topbar { + display: flex; + align-items: flex-start; + justify-content: space-between; + gap: 24px; + margin-bottom: 18px; +} + +.top-actions { + display: grid; + justify-items: end; + gap: 10px; +} + +.language-switch { + display: inline-flex; + border: 1px solid var(--line); + border-radius: 999px; + background: rgba(255, 253, 250, 0.72); + padding: 4px; +} + +.language-button { + min-height: 30px; + border: 0; + border-radius: 999px; + background: transparent; + color: var(--muted); + padding: 5px 11px; + font-size: 0.78rem; + font-weight: 900; +} + +.language-button.is-active { + background: var(--teal); + color: #fff; +} + +.eyebrow { + margin: 0 0 6px; + color: var(--oracle-dark); 
+ font-size: 0.74rem; + font-weight: 800; + letter-spacing: 0; + text-transform: uppercase; +} + +h1, +h2 { + margin: 0; + letter-spacing: 0; +} + +h1 { + max-width: 720px; + font-size: clamp(2rem, 4vw, 4rem); + line-height: 0.98; +} + +h2 { + font-size: 1.15rem; +} + +.model-pill, +.status-chip, +.lift-badge { + display: inline-flex; + min-height: 34px; + align-items: center; + gap: 8px; + border: 1px solid var(--line); + border-radius: 999px; + background: rgba(255, 253, 250, 0.72); + color: var(--muted); + padding: 7px 12px; + font-size: 0.82rem; + font-weight: 700; + white-space: nowrap; +} + +.model-pill { + direction: ltr; + max-width: min(560px, 100%); + overflow: hidden; + text-overflow: ellipsis; +} + +.signal-dot { + width: 9px; + height: 9px; + border-radius: 999px; + background: var(--teal); + box-shadow: 0 0 0 5px rgba(15, 118, 110, 0.14); +} + +.control-strip, +.pipeline, +.answer-panel, +.score-panel, +.mini-panel { + border: 1px solid var(--line); + border-radius: 8px; + background: rgba(255, 253, 250, 0.88); + box-shadow: var(--shadow); +} + +.control-strip { + display: grid; + grid-template-columns: minmax(0, 1fr) auto; + gap: 16px 20px; + align-items: end; + padding: 16px; +} + +.query-box { + display: grid; + gap: 8px; +} + +.query-box span { + color: var(--muted); + font-size: 0.82rem; + font-weight: 800; +} + +textarea { + width: 100%; + min-height: 72px; + resize: vertical; + border: 1px solid #cfc8bd; + border-radius: 8px; + background: #fff; + color: var(--ink); + padding: 13px 14px; + line-height: 1.45; + outline: none; +} + +[dir="rtl"] textarea { + direction: rtl; + text-align: right; +} + +textarea:focus { + border-color: var(--oracle); + box-shadow: 0 0 0 4px rgba(199, 70, 52, 0.13); +} + +.preset-row { + display: flex; + flex-wrap: wrap; + gap: 8px; +} + +.preset-button { + min-height: 38px; + border: 1px solid var(--line); + border-radius: 999px; + background: #fff; + color: var(--muted); + padding: 8px 12px; + font-size: 0.83rem; + 
font-weight: 800; +} + +.preset-button.is-active { + border-color: rgba(199, 70, 52, 0.34); + background: rgba(199, 70, 52, 0.1); + color: var(--oracle-dark); +} + +.switch-row { + display: flex; + align-items: center; + justify-content: flex-end; + gap: 12px; + min-width: 360px; +} + +.mode-label { + color: var(--muted); + font-size: 0.84rem; + font-weight: 800; +} + +.mode-label.strong { + color: var(--teal); +} + +.toggle { + position: relative; + display: inline-flex; + align-items: center; +} + +.toggle input { + position: absolute; + opacity: 0; + pointer-events: none; +} + +.toggle-track { + position: relative; + width: 66px; + height: 34px; + border: 1px solid #bbb2a4; + border-radius: 999px; + background: #ebe5dc; + transition: + background 180ms ease, + border-color 180ms ease; +} + +.toggle-thumb { + position: absolute; + top: 4px; + left: 4px; + width: 24px; + height: 24px; + border-radius: 50%; + background: #fff; + box-shadow: 0 3px 10px rgba(32, 31, 29, 0.22); + transition: transform 180ms ease; +} + +.toggle input:checked + .toggle-track { + border-color: rgba(15, 118, 110, 0.5); + background: var(--teal); +} + +.toggle input:checked + .toggle-track .toggle-thumb { + transform: translateX(32px); +} + +.toggle input:focus-visible + .toggle-track { + outline: 3px solid rgba(15, 118, 110, 0.28); + outline-offset: 3px; +} + +.pipeline { + display: grid; + grid-template-columns: auto 1fr auto 1fr auto 1fr auto; + align-items: center; + gap: 12px; + margin: 18px 0; + padding: 14px; +} + +.pipeline-step { + display: inline-flex; + align-items: center; + gap: 8px; + color: var(--muted); + font-size: 0.84rem; + font-weight: 900; + white-space: nowrap; +} + +.step-icon { + display: inline-grid; + width: 30px; + height: 30px; + place-items: center; + border-radius: 50%; + background: #efe8df; + color: var(--oracle-dark); + font-size: 0.76rem; +} + +.rerank-active .rerank-step .step-icon { + background: var(--teal); + color: #fff; +} + +.pipeline-line { + 
height: 3px; + border-radius: 999px; + background: linear-gradient(90deg, rgba(199, 70, 52, 0.35), rgba(15, 118, 110, 0.45)); +} + +.result-grid { + display: grid; + grid-template-columns: minmax(0, 0.88fr) minmax(0, 1.12fr); + gap: 18px; +} + +.answer-panel, +.score-panel, +.mini-panel { + padding: 18px; +} + +.panel-heading { + display: flex; + justify-content: space-between; + gap: 16px; + align-items: flex-start; + margin-bottom: 16px; +} + +[dir="rtl"] .panel-heading { + direction: rtl; +} + +.answer-copy { + margin: 0; + font-size: 1.14rem; + line-height: 1.58; +} + +.citation-row { + display: flex; + flex-wrap: wrap; + gap: 8px; + margin-top: 18px; +} + +.citation { + border: 1px solid var(--line); + border-radius: 999px; + background: #fff; + color: var(--muted); + padding: 7px 10px; + font-size: 0.78rem; + font-weight: 800; +} + +.rank-list { + display: grid; + gap: 10px; + max-height: 620px; + overflow-y: auto; + padding-right: 4px; +} + +.rank-card { + display: grid; + grid-template-columns: 40px minmax(0, 1fr) 86px; + gap: 12px; + align-items: start; + border: 1px solid var(--line); + border-radius: 8px; + background: #fff; + padding: 12px; +} + +.rank-card.is-top { + border-color: rgba(15, 118, 110, 0.38); + background: linear-gradient(90deg, rgba(221, 244, 240, 0.95), #fff 52%); +} + +.rank-number { + display: grid; + width: 34px; + height: 34px; + place-items: center; + border-radius: 50%; + background: #ede7de; + color: var(--oracle-dark); + font-weight: 900; +} + +.rank-card.is-top .rank-number { + background: var(--teal); + color: #fff; +} + +.doc-title { + margin: 0 0 6px; + font-weight: 900; +} + +.doc-copy { + margin: 0; + color: var(--muted); + font-size: 0.9rem; + line-height: 1.45; +} + +.tag-row { + display: flex; + flex-wrap: wrap; + gap: 6px; + margin-top: 9px; +} + +.tag { + border-radius: 999px; + background: #f1ece4; + color: #6d5547; + padding: 4px 7px; + font-size: 0.7rem; + font-weight: 900; +} + +.score-meter { + display: grid; + 
justify-items: end; + gap: 5px; +} + +[dir="rtl"] .score-meter { + justify-items: start; +} + +.score-value { + color: var(--blue); + font-size: 0.82rem; + font-weight: 900; +} + +.score-kind { + color: var(--muted); + font-size: 0.72rem; + font-weight: 900; + text-transform: uppercase; +} + +.movement { + color: var(--teal); + font-size: 0.75rem; + font-weight: 900; +} + +.movement.down { + color: var(--gold); +} + +.compare-grid { + display: grid; + grid-template-columns: repeat(2, minmax(0, 1fr)); + gap: 18px; + margin-top: 18px; +} + +.score-note { + margin: 14px 0 0; + color: var(--muted); + font-size: 0.78rem; + font-weight: 700; + line-height: 1.45; +} + +.mini-panel { + box-shadow: none; +} + +.mini-panel.accented { + border-color: rgba(15, 118, 110, 0.35); +} + +.mini-title { + margin-bottom: 12px; + color: var(--muted); + font-size: 0.85rem; + font-weight: 900; + text-transform: uppercase; +} + +.mini-stack { + display: grid; + max-height: 620px; + overflow-y: auto; + gap: 6px; + padding-right: 4px; +} + +.mini-item { + display: grid; + grid-template-columns: 30px minmax(0, 1fr); + gap: 9px; + align-items: center; + min-height: 42px; + border: 1px solid var(--line); + border-radius: 8px; + background: #fff; + padding: 7px 9px; +} + +.mini-item strong { + display: block; + overflow: hidden; + color: var(--ink); + font-size: 0.86rem; + text-overflow: ellipsis; + white-space: nowrap; +} + +.mini-item span { + color: var(--muted); + font-size: 0.75rem; + font-weight: 800; +} + +.mini-empty { + min-height: 48px; + border: 1px dashed var(--line); + border-radius: 8px; + color: var(--muted); + display: grid; + place-items: center; + padding: 10px; + font-size: 0.8rem; + font-weight: 800; + text-align: center; +} + +.mini-rank { + display: grid; + width: 27px; + height: 27px; + place-items: center; + border-radius: 50%; + background: #f0e9df; + color: var(--oracle-dark); + font-size: 0.8rem; + font-weight: 900; +} + +.accented .mini-rank { + background: 
var(--teal-soft); + color: var(--teal); +} + +.sr-only { + position: absolute; + width: 1px; + height: 1px; + overflow: hidden; + clip: rect(0, 0, 0, 0); + white-space: nowrap; +} + +@media (max-width: 900px) { + .app-shell { + padding: 18px; + } + + .topbar, + .control-strip, + .result-grid, + .compare-grid { + grid-template-columns: 1fr; + } + + .topbar { + display: grid; + } + + .top-actions { + justify-items: start; + } + + [dir="rtl"] .top-actions { + justify-items: end; + } + + .switch-row { + justify-content: flex-start; + min-width: 0; + } + + .pipeline { + grid-template-columns: 1fr; + } + + .pipeline-line { + width: 3px; + height: 18px; + margin-left: 14px; + } +} + +@media (max-width: 560px) { + h1 { + font-size: 2rem; + } + + .switch-row { + align-items: flex-start; + flex-direction: column; + } + + .rank-card { + grid-template-columns: 36px minmax(0, 1fr); + } + + .score-meter { + grid-column: 2; + justify-items: start; + } +}
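The `.movement` and `.movement.down` rules in `styles.css` style a per-document rank-change indicator, and `rerank_results` in `server.py` attaches both `vectorRank` and `rerankRank` to every reranked document. The front-end code that connects the two is not part of this section, so what follows is only a minimal Python sketch of how that movement could be derived from the API response. The helper name `rank_movement`, the `up`/`down`/`same` labels, and the sample documents are hypothetical and do not come from the demo.

```python
def rank_movement(reranked_results):
    """Return (id, delta) pairs for documents that carry both ranks.

    delta = vectorRank - rerankRank, so a positive value means the
    reranker promoted the document and a negative value means it demoted it.
    """
    movements = []
    for doc in reranked_results:
        vector_rank = doc.get("vectorRank")
        rerank_rank = doc.get("rerankRank")
        # Skip entries that are missing either rank instead of guessing.
        if vector_rank is None or rerank_rank is None:
            continue
        movements.append((doc.get("id"), vector_rank - rerank_rank))
    return movements


# Hypothetical sample shaped like the "rerankedResults" list in the response.
sample = [
    {"id": "doc-b", "vectorRank": 3, "rerankRank": 1},
    {"id": "doc-a", "vectorRank": 1, "rerankRank": 2},
    {"id": "doc-c", "vectorRank": 2, "rerankRank": 3},
]

for doc_id, delta in rank_movement(sample):
    label = "up" if delta > 0 else "down" if delta < 0 else "same"
    print(doc_id, delta, label)
```

A signed delta keeps the direction and the size of the change in one number, which maps naturally onto a default style for promotions and a `.down` modifier for demotions.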