Merged
5 changes: 5 additions & 0 deletions .gitignore
@@ -27,4 +27,9 @@ wheels/
 .gradio
 .adk
 *.db
 *.db-journal
+implementations/report_generation/reports/*
+implementations/report_generation/data/*.zip
+implementations/report_generation/data/*.csv
+implementations/report_generation/data/*.xls
+implementations/report_generation/data/*.xlsx
@@ -75,8 +75,8 @@ def get_report_generation_agent(
         model=client_manager.configs.default_worker_model,
         instruction=instructions,
         tools=[
             db_manager.report_generation_db().execute,
             db_manager.report_generation_db().get_schema_info,
-            db_manager.report_generation_db().execute,
+            report_file_writer.write_xlsx,
         ],
         after_agent_callback=after_agent_callback,
@@ -362,25 +362,29 @@ async def run_agent_with_retry(agent: Agent, agent_input: str) -> list[Event]:
     list[Event]
         The events from the agent run.
     """
-    logger.info(f"Running agent {agent.name} with input '{agent_input[:100]}...'")
-
-    # Create session and runner
-    session_service = InMemorySessionService()
-    runner = Runner(app_name=agent.name, agent=agent, session_service=session_service)
-    current_session = await session_service.create_session(
-        app_name=agent.name,
-        user_id="user",
-        state={},
-    )
-
-    # create the user message and run the agent
-    content = Content(role="user", parts=[Part(text=agent_input)])
-    events = []
-    async for event in runner.run_async(
-        user_id="user",
-        session_id=current_session.id,
-        new_message=content,
-    ):
-        events.append(event)
-
-    return events
+    try:
+        logger.info(f"Running agent {agent.name} with input '{agent_input[:100]}...'")
+
+        # Create session and runner
+        session_service = InMemorySessionService()
+        runner = Runner(app_name=agent.name, agent=agent, session_service=session_service)
+        current_session = await session_service.create_session(
+            app_name=agent.name,
+            user_id="user",
+            state={},
+        )
+
+        # create the user message and run the agent
+        content = Content(role="user", parts=[Part(text=agent_input)])
+        events = []
+        async for event in runner.run_async(
+            user_id="user",
+            session_id=current_session.id,
+            new_message=content,
+        ):
+            events.append(event)
+
+        return events
+    except Exception as e:
+        logger.error(f"Error running agent {agent.name} with input '{agent_input[:100]}...': {e}")
+        raise e  # raising the exception so the retry mechanism can try again
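The bare re-raise at the end of this hunk only pays off if something upstream actually retries the call. The repository's retry mechanism is not shown in this diff; as a sketch of what such a wrapper could look like (a hypothetical `with_retry` decorator, not the project's implementation):

```python
import asyncio
import functools
import logging

logger = logging.getLogger(__name__)


def with_retry(max_attempts: int = 3, delay_seconds: float = 0.0):
    """Retry an async function when it raises, up to max_attempts times."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(1, max_attempts + 1):
                try:
                    return await func(*args, **kwargs)
                except Exception as e:
                    # A raised exception triggers the next attempt, exactly
                    # like the re-raise in the hunk above.
                    last_error = e
                    logger.warning("Attempt %d/%d failed: %s", attempt, max_attempts, e)
                    if attempt < max_attempts:
                        await asyncio.sleep(delay_seconds)
            raise last_error

        return wrapper

    return decorator
```

A function decorated this way fails twice, is retried, and succeeds on the third attempt without the caller ever seeing the intermediate errors.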
@@ -38,7 +38,7 @@ def __init__(self, reports_output_path: Path):

     def write_xlsx(
         self,
-        report_data: list[Any],
+        report_data: list[list[Any]],
         report_columns: list[str],
         filename: str = "report.xlsx",
         gradio_link: bool = True,
@@ -47,7 +47,7 @@

         Parameters
         ----------
-        report_data : list[Any]
+        report_data : list[list[Any]]
             The data of the report.
         report_columns : list[str]
             The columns of the report.
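The signature change makes the contract explicit: `report_data` is a list of rows, where each row is a list of cell values aligned positionally with `report_columns`. A minimal shape check (a hypothetical helper, independent of the real writer) illustrates what the new type promises:

```python
from typing import Any


def validate_report(report_data: list[list[Any]], report_columns: list[str]) -> None:
    """Check that every row has exactly one cell per column."""
    for i, row in enumerate(report_data):
        if len(row) != len(report_columns):
            raise ValueError(
                f"Row {i} has {len(row)} cells, expected {len(report_columns)}"
            )


# Two columns, so every row must have two cells.
columns = ["Country", "Revenue"]
rows = [["United Kingdom", 1234.5], ["France", 321.0]]
validate_report(rows, columns)  # passes: shape matches
```

With the old `list[Any]` annotation a flat list like `["United Kingdom", 1234.5]` would have type-checked even though it is one row, not a table.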
195 changes: 195 additions & 0 deletions implementations/report_generation/01_Importing_the_Dataset.ipynb
Review comment (Member): I'm generally not in favour of committing the outputs of cells. It just adds a lot more clutter to the git history, and it usually changes between runs as well. So consider clearing the outputs and committing only the code.
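One way to clear outputs before committing is `jupyter nbconvert --clear-output --inplace <notebook>.ipynb`. The same can be done with a small stdlib script, since a `.ipynb` file is plain JSON (a sketch, not part of this PR):

```python
import json
from pathlib import Path


def clear_notebook_outputs(path: Path) -> None:
    """Strip outputs and execution counts from a .ipynb file in place."""
    nb = json.loads(path.read_text())
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    path.write_text(json.dumps(nb, indent=1))
```

Running this as a pre-commit hook keeps the diff limited to source changes.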

@@ -0,0 +1,195 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "cc016c29-e5e8-4338-9a31-0fa6700505ff",
"metadata": {},
"source": [
"# Importing the Dataset for the Report Generation Agent\n",
"\n",
"This notebook implements the **data import** for the **Report Generation Agent** for a single-table\n",
"relational data source.\n",
"\n",
"The data source implemented here is an [SQLite](https://sqlite.org/) database, which is supported\n",
"natively by Python and stores its data on disk.\n",
"[SQLAlchemy](https://www.sqlalchemy.org/) is used as the SQL connection layer, so this\n",
"SQL connection can easily be swapped for other databases.\n",
"\n",
"The SQLAlchemy engine is set up to allow **read-only queries** only, so there is **no risk** of the agent running queries that modify the database."
]
},
{
"cell_type": "markdown",
"id": "8499a56f-716f-47a6-b255-8bdbbe0fd777",
"metadata": {},
"source": [
"## Setting up\n",
"\n",
"The code below sets the notebook's working directory to the project root, defines the default constants, and checks that the required environment variables are present.\n",
"\n",
"The environment variables can be set in the `.env` file in the root folder of the project."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4cc2db20-296f-4822-916c-b8255073c066",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import ssl\n",
"import urllib.request\n",
"import zipfile\n",
"from pathlib import Path\n",
"\n",
"import certifi\n",
"import pandas as pd\n",
"from aieng.agent_evals.async_client_manager import AsyncClientManager\n",
"\n",
"\n",
"# Setting the notebook directory to the project's root folder\n",
"if Path(\"\").absolute().name == \"eval-agents\":\n",
" print(f\"Notebook path is already the root path: {Path('').absolute()}\")\n",
"else:\n",
" os.chdir(Path(\"\").absolute().parent.parent)\n",
" print(f\"The notebook path has been set to: {Path('').absolute()}\")\n",
"\n",
"client_manager = AsyncClientManager.get_instance()\n",
"assert client_manager.configs.report_generation_db.database, (\n",
" \"[ERROR] The database path is not set! Please configure the REPORT_GENERATION_DB__DATABASE environment variable.\"\n",
")\n",
"\n",
"print(\"All environment variables have been set.\")\n",
"\n",
"DATA_FOLDER = Path(\"implementations/report_generation/data\")\n",
"DATASET_PATH = DATA_FOLDER / \"OnlineRetail.csv\"\n",
"\n",
"from implementations.report_generation.data.import_online_retail_data import import_online_retail_data # noqa: E402"
]
},
{
"cell_type": "markdown",
"id": "0aa0bdf0-a7ba-4458-868b-07d627b12ed9",
"metadata": {},
"source": [
"## Dataset\n",
"\n",
"The dataset used in this example is the\n",
"**[Online Retail](https://archive.ics.uci.edu/dataset/352/online+retail) dataset**. It contains\n",
"information about **invoices** for products purchased by customers, including the product\n",
"quantity, the invoice date, and the country where the customer resides. For the detailed\n",
"data structure, please check the [OnlineRetail.ddl](http://localhost:8888/lab/tree/implementations/report_generation/data/OnlineRetail.ddl) file."
]
},
{
"cell_type": "markdown",
"id": "553dceaa-8fe7-4e4f-9940-b2d1e8d8d6ee",
"metadata": {},
"source": [
"## Downloading the Dataset\n",
"\n",
"The code below will **download and unzip** the dataset to the `implementations/report_generation/data/` folder."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "554f1cc6-c42f-4fe3-8857-214fcbeafd95",
"metadata": {},
"outputs": [],
"source": [
"url = \"https://archive.ics.uci.edu/static/public/352/online+retail.zip\"\n",
"zip_file_path = DATA_FOLDER / \"online_retail.zip\"\n",
"xlsx_file_path = DATA_FOLDER / \"Online Retail.xlsx\"\n",
"\n",
"print(\"Downloading the dataset...\")\n",
"ctx = ssl.create_default_context(cafile=certifi.where())\n",
"req = urllib.request.Request(url)\n",
"with urllib.request.urlopen(req, context=ctx) as resp, open(zip_file_path, \"wb\") as f:\n",
" f.write(resp.read())\n",
"\n",
"print(\"Extracting the dataset file...\")\n",
"with zipfile.ZipFile(zip_file_path, \"r\") as zf:\n",
" zf.extractall(DATA_FOLDER)\n",
"\n",
"print(\"Converting the dataset file from .xlsx to .csv...\")\n",
"df = pd.read_excel(xlsx_file_path)\n",
"df.to_csv(DATASET_PATH, index=False)\n",
"\n",
"print(\"Done!\")"
]
},
{
"cell_type": "markdown",
"id": "2e4e45de-1f07-41de-bf9c-f9c2543b8cb3",
"metadata": {},
"source": [
"## Visualizing the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "123d37f3-fd6f-4676-8f84-8bcfa45a0535",
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv(DATASET_PATH)\n",
"df # noqa: B018"
]
},
{
"cell_type": "markdown",
"id": "ec3a84cb-3636-460a-a9bd-e6ea57d3f9d7",
"metadata": {},
"source": [
"## Importing the Data\n",
"\n",
"The code below will import the `.csv` dataset to the database at the path set by the `REPORT_GENERATION_DB__DATABASE` environment variable."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ee28609b-6eed-4aea-a5e0-c7d6df57e0af",
"metadata": {},
"outputs": [],
"source": [
"import_online_retail_data(DATASET_PATH)\n",
"print(\"Done!\")"
]
},
{
"cell_type": "markdown",
"id": "ce2ab32d-859f-44b1-9c54-cf9c2c1f1da4",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"The data should now be ready to be consumed by the agent in the **next notebook**."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
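The notebook's read-only guarantee is enforced by the project's SQLAlchemy setup, which is not shown in this diff. For SQLite specifically, the same guarantee can be obtained at the connection level with a read-only URI; the stdlib sketch below (independent of the project's DB manager; table and file names are illustrative) shows the effect:

```python
import sqlite3
import tempfile
from pathlib import Path

db_path = Path(tempfile.mkdtemp()) / "retail.db"

# Create and populate a table with a normal read-write connection.
with sqlite3.connect(db_path) as conn:
    conn.execute("CREATE TABLE invoices (invoice_no TEXT, quantity INTEGER)")
    conn.execute("INSERT INTO invoices VALUES ('536365', 6)")

# Re-open the same file read-only: SELECTs work, writes raise OperationalError.
ro = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
rows = ro.execute("SELECT invoice_no, quantity FROM invoices").fetchall()
print(rows)  # [('536365', 6)]
try:
    ro.execute("DELETE FROM invoices")
except sqlite3.OperationalError as e:
    print("write blocked:", e)
```

With SQLAlchemy, an equivalent `file:...?mode=ro` URI can be passed through to the SQLite driver, so the agent's `execute` tool physically cannot modify the data.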