Commit 8cda41d

Sam Partee authored
First pass at docs (#19)
General first pass at the documentation
1 parent c6f49d6 commit 8cda41d

30 files changed, +1518 -70 lines changed

Makefile

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ mypy:
 # help: docs - generate project documentation
 .PHONY: docs
 docs:
-	@cd doc; make html
+	@cd docs; make html

 # help:
 # help: Test

README.md

Lines changed: 97 additions & 69 deletions
@@ -1,100 +1,128 @@
-# RedisVL
+# RedisVL: Python Client Library for Redis as a Vector Database

-> DISCLAIMER: This project is still under signifigant development and should not be used in any production settings. We would love input/contributions as we finalize what the CLI and library interfaces should look like.

-A CLI and Library to help with loading data into Redis specifically for
-usage with RediSearch and Redis Vector Search capabilities
+[![Codecov](https://img.shields.io/codecov/c/github/RedisVentures/RedisVL/dev?label=Codecov&logo=codecov&token=E30WxqBeJJ)](https://codecov.io/gh/RedisVentures/RedisVL)
+[![License](https://img.shields.io/badge/License-BSD-3--blue.svg)](https://opensource.org/licenses/mit/)

-### Usage

-```
-usage: redisvl <command> [<args>]
+RedisVL provides a powerful Python client library for using Redis as a Vector Database. Leverage the speed and reliability of Redis along with vector-based semantic search capabilities to supercharge your application!

-Commands:
-    load        Load vector data into redis
-    index       Index manipulation (create, delete, etc.)
-    query       Query an existing index
+**Note:** This project is rapidly evolving, and the API may change frequently. Always refer to the most recent [documentation](https://redisvl.com/docs).
+## 🚀 What is RedisVL?

-Redis Vector load CLI
+Vector databases have become increasingly popular in recent years due to their ability to store and retrieve vectors efficiently. However, most vector databases are complex to use and require a lot of time and effort to set up. RedisVL aims to solve this problem by providing a simple and intuitive interface for using Redis as a vector database.

-positional arguments:
-  command     Subcommand to run
+RedisVL provides a client library that enables you to harness the power of Redis as a vector database. This library simplifies the process of storing, retrieving, and performing semantic searches on vectors in Redis. It also provides a robust index management system that allows you to create, update, and delete indices with ease.

-optional arguments:
-  -h, --help  show this help message and exit

-```
+### Capabilities
+
+RedisVL has a host of powerful features designed to streamline your vector database operations.
+
+1. **Index Management**: RedisVL allows for indices to be created, updated, and deleted with ease. A schema for each index can be defined in yaml or directly in python code and used throughout the lifetime of the index.
+
+2. **Vector Creation**: RedisVL integrates with OpenAI and other embedding providers to make the process of creating vectors straightforward.
+
+3. **Vector Search**: RedisVL provides robust search capabilities that enable you to query vectors synchronously and asynchronously. Hybrid queries that utilize tag, geographic, numeric, and other filters like full-text search are also supported.
+
+4. **Semantic Caching**: ``LLMCache`` is a semantic caching interface built directly into RedisVL. It allows for the caching of generated output from LLM models like GPT-3 and others. As semantic search is used to check the cache, a threshold can be set to determine if the cached result is relevant enough to be returned. If not, the model is called and the result is cached for future use. This can increase the QPS and reduce the cost of using LLM models.
+
+
+## 😊 Quick Start

-For any of the above commands, you will need to have an index schema written
-into a yaml file for the cli to read. The format of the schema is as follows
+Please note that this library is still under heavy development, and while you can quickly try RedisVL and deploy it in a production environment, the API may be subject to change at any time.
+
+`pip install redisvl`
+
+## Example Usage
+
+### Index Management
+
+Indices can be defined through yaml specification that corresponds directly to the RediSearch field names and arguments in redis-py

 ```yaml
 index:
-  name: sample # index name used for querying
+  name: users
   storage_type: hash
-  key_field: "id" # column name to use for key in redis
-  prefix: vector # prefix used for all loaded docs
+  prefix: "user:"
+  key_field: "id"

-# all fields to create index with
-# sub-items correspond to redis-py Field arguments
 fields:
+  # define tag fields
   tag:
-    categories: # name of a tag field used for queries
-      separator: "|"
-    year: # name of a tag field used for queries
-      separator: "|"
+    - name: users
+    - name: job
+    - name: credit_score
+  # define numeric fields
+  numeric:
+    - name: age
+  # define vector fields
   vector:
-    vector: # name of the vector field used for queries
-      datatype: "float32"
-      algorithm: "flat" # flat or HSNW
-      dims: 768
-      distance_metric: "cosine" # ip, L2, cosine
+    - name: user_embedding
+      algorithm: hnsw
+      distance_metric: cosine
 ```

-#### Example Usage
-These examples reference [provided sample data](sample-data/).
+This would correspond to a dataset that looked something like

-```bash
-# load in a pickled dataframe with
-redisvl load -s sample-data/sample.yml -d sample-data/pandas-sample.pkl
-```
+| users | age | job | credit_score | user_embedding |
+|-------|-----|------------|--------------|-----------------------------------|
+| john | 1 | engineer | high | \x3f\x8c\xcc\x3f\x8c\xcc?@ |
+| mary | 2 | doctor | low | \x3f\x8c\xcc\x3f\x8c\xcc?@ |
+| joe | 3 | dentist | medium | \x3f\xab\xcc?\xab\xcc?@ |

-```bash
-# load in a pickled dataframe to a specific address and port
-redisvl load -s sample-data/sample.yml -d sample-data/pandas-sample.pkl -h 127.0.0.1 -p 6379
-```

-```bash
-# load in a pickled dataframe to a specific
-# address and port and with password
-redisvl load -s sample-data/sample.yml -d sample-data/pandas-sample.pkl -h 127.0.0.1 -p 6379 -p supersecret
-```
+With the schema, the RedisVL library can be used to create, load vectors and perform vector searches
+```python
+from redisvl.index import SearchIndex
+from redisvl.query import create_vector_query

-### Support
+# define and create the index
+index = SearchIndex.from_yaml("./users_schema.yml")
+index.connect("redis://localhost:6379")
+index.create()

-#### Supported Index Fields
+index.load(pd.read_csv("./users.csv").to_records())

-- ``geo``
-- ``tag``
-- ``numeric``
-- ``vector``
-- ``text``
-#### Supported Data Types
-- Pandas DataFrame (pickled)
-#### Supported Redis Data Types
-- Hash
-- JSON (soon)
+query = create_vector_query(
+    ["users", "age", "job", "credit_score"],
+    number_of_results=2,
+    vector_field_name="user_embedding",
+)

-### Install
-Install the Python requirements listed in `requirements.txt`.
+query_vector = np.array([0.1, 0.1, 0.5]).tobytes()
+results = index.search(query, query_params={"vector": query_vector})
+
+```

-```bash
-git clone https://github.com/RedisVentures/data-loader.git
-cd redisvl
-pip install .
+### Semantic cache
+
+The ``LLMCache`` Interface in RedisVL can be used as follows.
+
+```python
+# init open ai client
+import openai
+openai.api_key = "sk-xxx"
+
+from redisvl.llmcache.semantic import SemanticCache
+cache = SemanticCache(redis_host="localhost", redis_port=6379, redis_password=None)
+
+def ask_gpt3(question):
+    response = openai.Completion.create(
+        engine="text-davinci-003",
+        prompt=question,
+        max_tokens=100
+    )
+    return response.choices[0].text.strip()
+
+def answer_question(question: str):
+    results = cache.check(question)
+    if results:
+        return results[0]
+    else:
+        answer = ask_gpt3(question)
+        cache.store(question, answer)
+        return answer
 ```

-### Creating Input Data
-#### Pandas DataFrame

-more to come, see tests and sample-data for usage
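
Note on the query vector in the added README snippet: `np.array([0.1, 0.1, 0.5]).tobytes()` produces float64 bytes (8 per component) because that is NumPy's default for Python floats, while the schema removed in this commit declared `datatype: "float32"`. The sketch below shows one way to pack a query vector as float32 using only the standard library. It assumes the index stores little-endian float32 values; `to_float32_bytes` and `from_float32_bytes` are hypothetical helper names, not part of RedisVL.

```python
import struct


def to_float32_bytes(vector):
    """Pack a sequence of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_float32_bytes(blob):
    """Unpack little-endian float32 bytes back into a list of floats."""
    count = len(blob) // 4  # each float32 component occupies 4 bytes
    return list(struct.unpack(f"<{count}f", blob))


query_vector = to_float32_bytes([0.1, 0.1, 0.5])
print(len(query_vector))  # 12 bytes: 3 dimensions x 4 bytes each
```

With NumPy, the equivalent would be to pass `dtype=np.float32` when building the array before calling `tobytes()`.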

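The README describes semantic caching as a similarity lookup with a threshold: return the cached answer if it is close enough, otherwise call the model and store the result. The following is a minimal, standard-library sketch of that flow, not RedisVL's implementation. `ToySemanticCache`, `embed`, and `fake_llm` are hypothetical stand-ins; the word-count "embedding" replaces a real embedding model, and an in-memory list replaces Redis.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToySemanticCache:
    """In-memory cache that returns responses whose prompts are similar enough."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def check(self, prompt):
        """Return cached responses at or above the threshold, best match first."""
        query = embed(prompt)
        scored = [(cosine_similarity(query, emb), resp) for emb, resp in self.entries]
        hits = [pair for pair in scored if pair[0] >= self.threshold]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [resp for _, resp in hits]

    def store(self, prompt, response):
        self.entries.append((embed(prompt), response))


calls = []  # records every time the (fake) model is actually invoked


def fake_llm(question):
    calls.append(question)
    return f"answer to: {question}"


cache = ToySemanticCache(threshold=0.8)


def answer_question(question):
    results = cache.check(question)
    if results:
        return results[0]  # cache hit: skip the model call
    answer = fake_llm(question)
    cache.store(question, answer)
    return answer
```

Raising the threshold makes hits rarer but more precise; lowering it saves more model calls at the risk of returning an answer to a different question.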
docs/Makefile

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+
+# build docs
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+"""A directive to generate a gallery of images from structured data.
+
+Generating a gallery of images that are all the same size is a common
+pattern in documentation, and this can be cumbersome if the gallery is
+generated programmatically. This directive wraps this particular use-case
+in a helper-directive to generate it with a single YAML configuration file.
+
+It currently exists for maintainers of the pydata-sphinx-theme,
+but might be abstracted into a standalone package if it proves useful.
+"""
+from pathlib import Path
+from typing import Any, Dict, List
+
+from docutils import nodes
+from docutils.parsers.rst import directives
+from sphinx.application import Sphinx
+from sphinx.util import logging
+from sphinx.util.docutils import SphinxDirective
+from yaml import safe_load
+
+logger = logging.getLogger(__name__)
+
+
+TEMPLATE_GRID = """
+`````{{grid}} {grid_columns}
+{container_options}
+
+{content}
+
+`````
+"""
+
+GRID_CARD = """
+````{{grid-item-card}} {title}
+{card_options}
+
+{content}
+````
+"""
+
+
+class GalleryDirective(SphinxDirective):
+    """A directive to show a gallery of images and links in a grid."""
+
+    name = "gallery-grid"
+    has_content = True
+    required_arguments = 0
+    optional_arguments = 1
+    final_argument_whitespace = True
+    option_spec = {
+        # A class to be added to the resulting container
+        "grid-columns": directives.unchanged,
+        "class-container": directives.unchanged,
+        "class-card": directives.unchanged,
+    }
+
+    def run(self) -> List[nodes.Node]:
+        if self.arguments:
+            # If an argument is given, assume it's a path to a YAML file
+            # Parse it and load it into the directive content
+            path_data_rel = Path(self.arguments[0])
+            path_doc, _ = self.get_source_info()
+            path_doc = Path(path_doc).parent
+            path_data = (path_doc / path_data_rel).resolve()
+            if not path_data.exists():
+                logger.warning(f"Could not find grid data at {path_data}.")
+                node = nodes.paragraph(text=f"No grid data found at {path_data}.")
+                return [node]
+            yaml_string = path_data.read_text()
+        else:
+            yaml_string = "\n".join(self.content)
+
+        # Read in YAML so we can generate the gallery
+        grid_data = safe_load(yaml_string)
+
+        grid_items = []
+        for item in grid_data:
+            # Grid card parameters
+            options = {}
+            if "website" in item:
+                options["link"] = item["website"]
+
+            if "class-card" in self.options:
+                options["class-card"] = self.options["class-card"]
+
+            if "img-background" in item:
+                options["img-background"] = item["img-background"]
+
+            if "img-top" in item:
+                options["img-top"] = item["img-top"]
+
+            if "img-bottom" in item:
+                options["img-bottom"] = item["img-bottom"]
+
+            options_str = "\n".join(f":{k}: {v}" for k, v in options.items()) + "\n\n"
+
+            # Grid card content
+            content_str = ""
+            if "header" in item:
+                content_str += f"{item['header']}\n\n^^^\n\n"
+
+            if "image" in item:
+                content_str += f"![Gallery image]({item['image']})\n\n"
+
+            if "content" in item:
+                content_str += f"{item['content']}\n\n"
+
+            if "footer" in item:
+                content_str += f"+++\n\n{item['footer']}\n\n"
+
+            title = item.get("title", "")
+            content_str += "\n"
+            grid_items.append(
+                GRID_CARD.format(
+                    card_options=options_str, content=content_str, title=title
+                )
+            )
+
+        # Parse the template with Sphinx Design to create an output
+        container = nodes.container()
+        # Prep the options for the template grid
+        container_options = {"gutter": 2, "class-container": "gallery-directive"}
+        if "class-container" in self.options:
+            container_options[
+                "class-container"
+            ] += f' {self.options["class-container"]}'
+        container_options_str = "\n".join(
+            f":{k}: {v}" for k, v in container_options.items()
+        )
+
+        # Create the directive string for the grid
+        grid_directive = TEMPLATE_GRID.format(
+            grid_columns=self.options.get("grid-columns", "1 2 3 4"),
+            container_options=container_options_str,
+            content="\n".join(grid_items),
+        )
+        # Parse content as a directive so Sphinx Design processes it
+        self.state.nested_parse([grid_directive], 0, container)
+        # Sphinx Design outputs a container too, so just use that
+        container = container.children[0]
+
+        # Add extra classes
+        if self.options.get("container-class", []):
+            container.attributes["classes"] += self.options.get("class", [])
+        return [container]
+
+
+def setup(app: Sphinx) -> Dict[str, Any]:
+    """Add custom configuration to sphinx app.
+
+    Args:
+        app: the Sphinx application
+    Returns:
+        the 2 parallel parameters set to ``True``.
+    """
+    app.add_directive("gallery-grid", GalleryDirective)
+
+    return {
+        "parallel_read_safe": True,
+        "parallel_write_safe": True,
+    }
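
To see what the directive's loop produces for a single YAML item, here is a simplified, standalone sketch of the card-rendering step. It needs neither Sphinx nor PyYAML and is not the directive itself. `render_card` is a hypothetical helper name, and the item keys (`title`, `website`, `header`, `image`, `content`, `footer`, `img-*`) are the ones read in `run()` above.

```python
FENCE = "`" * 4  # four backticks, built here so this example's own fence stays intact


def render_card(item, class_card=None):
    """Render one gallery item as MyST grid-item-card markup."""
    # Card options become ":key: value" lines under the directive header.
    options = {}
    if "website" in item:
        options["link"] = item["website"]
    if class_card is not None:
        options["class-card"] = class_card
    for key in ("img-background", "img-top", "img-bottom"):
        if key in item:
            options[key] = item[key]
    options_str = "\n".join(f":{k}: {v}" for k, v in options.items())

    # Card body: optional header, image, content, and footer sections.
    parts = []
    if "header" in item:
        parts.append(f"{item['header']}\n\n^^^")
    if "image" in item:
        parts.append(f"![Gallery image]({item['image']})")
    if "content" in item:
        parts.append(item["content"])
    if "footer" in item:
        parts.append(f"+++\n\n{item['footer']}")
    content_str = "\n\n".join(parts)

    title = item.get("title", "")
    return f"{FENCE}{{grid-item-card}} {title}\n{options_str}\n\n{content_str}\n{FENCE}\n"


card = render_card(
    {"title": "RedisVL", "website": "https://example.com", "content": "Vector search"}
)
print(card)
```

In the real directive, the rendered cards are joined into the grid template and handed to `self.state.nested_parse`, so Sphinx Design does the actual HTML generation.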

docs/_static/.nojekyll

Whitespace-only changes.
41.2 KB
188 KB

docs/_static/apple-touch-icon.png

38.5 KB

docs/_static/css/custom.css

Whitespace-only changes.

docs/_static/favicon-16x16.png

817 Bytes
