141 changes: 141 additions & 0 deletions .agents/skills/mellea-logging/SKILL.md
@@ -0,0 +1,141 @@
---
name: mellea-logging
description: >
Best-practices guide for adding or reviewing logging in the Mellea codebase.
Covers when to use log_context() vs a dedicated logger call, canonical field
names, reserved attribute constraints, async/thread safety, and what events
deserve dedicated log lines.
Use when: adding a new log call; reviewing a PR that touches MelleaLogger;
deciding where to inject context fields; debugging why a field is missing from
a log record; or ensuring consistency with the project logging conventions.
argument-hint: "[file-or-directory]"
compatibility: "Claude Code, IBM Bob"
metadata:
version: "2026-04-15"
capabilities: [read_file, grep, glob]
---

# Mellea Logging Best Practices

All logging in Mellea flows through `MelleaLogger.get_logger()`, defined in
`mellea/core/utils.py`. This skill documents the conventions for adding and
reviewing log instrumentation.

## Quick reference

```python
from mellea.core import MelleaLogger, log_context, set_log_context, clear_log_context

logger = MelleaLogger.get_logger()

# Dedicated log call — for a discrete event
logger.info("SUCCESS")

# Context injection — attach fields to every record in a scope
with log_context(request_id="req-abc", trace_id="t-1"):
    logger.info("Starting generation")  # includes request_id, trace_id
    # ... all nested calls inherit these fields automatically
```

## When to add a dedicated log call

Use `logger.debug/info/warning/error(...)` for **discrete, named events**:

| Event type | Level | Example |
|------------|-------|---------|
| Phase transition | INFO | `"SUCCESS"`, `"FAILED"`, `"Starting session"` |
| Loop progress | INFO | `"Running loop 2 of 3"` |
| Recoverable issue | WARNING | `"Warmup failed for model: ..."` |
| Unexpected failure | ERROR | exception tracebacks, hard failures |
| Verbose diagnostics | DEBUG | token counts, prompt previews |

Do **not** add log calls for:

- Values already captured in a `log_context` field (redundant noise)
- Internal helper functions where the calling function already logs the event
- State that is already reflected in telemetry spans

## When to use log_context

Use `log_context` (or `set_log_context`) to attach **identifiers and metadata
that should appear on every log record within a scope** — without threading
them through every call.

Typical injection points:

| Scope | Where to inject | Fields |
|-------|----------------|--------|
| Session lifetime | `MelleaSession.__enter__` | `session_id`, `backend`, `model_id` |
| Sampling loop | `BaseSamplingStrategy.sample()` | `strategy`, `loop_budget` |
| HTTP request handler | entry point of the handler | `request_id`, `trace_id` |
| Background task | top of the task coroutine | `task_id`, `job_name` |

## Canonical field names

Use these names consistently. Do not invent synonyms.

| Field | Type | Description |
|-------|------|-------------|
| `session_id` | str (UUID) | Unique ID for a `MelleaSession` |
| `backend` | str | Backend class name, e.g. `"OllamaModelBackend"` |
| `model_id` | str | Model identifier string |
| `strategy` | str | Sampling strategy class name |
| `loop_budget` | int | Max generate/validate cycles for this sampling call |
| `request_id` | str | Caller-supplied request identifier |
| `trace_id` | str | Distributed trace ID (from OpenTelemetry or caller) |
| `span_id` | str | Span ID within a trace |
| `user_id` | str | End-user identifier (when applicable) |

## Reserved attribute names — do not use as context fields

The following names are standard `logging.LogRecord` attributes. Passing them
to `log_context()` or `set_log_context()` raises `ValueError`. See
`RESERVED_LOG_RECORD_ATTRS` in `mellea/core/utils.py` for the full set.

`args`, `created`, `exc_info`, `exc_text`, `filename`, `funcName`,
`levelname`, `levelno`, `lineno`, `message`, `module`, `msecs`, `msg`,
`name`, `pathname`, `process`, `processName`, `relativeCreated`,
`stack_info`, `thread`, `threadName`
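
The standard library applies a similar rule to `extra=`: `Logger.makeRecord` raises `KeyError` when a key would overwrite an existing record attribute. A guard of the kind described above might look like the sketch below. `RESERVED` and `validate_fields` are hypothetical names for illustration only; the real set is `RESERVED_LOG_RECORD_ATTRS`, and its exact contents may differ from what this sketch derives.

```python
import logging

# Derive reserved names from an actual LogRecord (illustrative assumption;
# Mellea keeps its own set in RESERVED_LOG_RECORD_ATTRS).
_blank = logging.LogRecord("demo", logging.INFO, "file.py", 1, "msg", None, None)
RESERVED = set(vars(_blank)) | {"message", "asctime"}


def validate_fields(**fields):
    """Hypothetical guard: reject fields that shadow LogRecord attributes."""
    clashes = sorted(RESERVED.intersection(fields))
    if clashes:
        raise ValueError(f"reserved LogRecord attribute(s): {clashes}")
    return fields


accepted = validate_fields(request_id="req-abc")  # no clash, returned as-is

error_text = ""
try:
    validate_fields(module="my_module", request_id="req-abc")
except ValueError as exc:
    error_text = str(exc)

print(error_text)
```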

## Prefer the context manager over set/clear

```python
# Preferred — guaranteed cleanup even on exceptions
with log_context(trace_id="abc"):
    do_work()

# Acceptable only when lifetime equals __enter__/__exit__
# (e.g. MelleaSession, where the CM already guarantees cleanup)
set_log_context(session_id=self.id)
# ... later in __exit__ ...
clear_log_context()
```

The context manager uses a `ContextVar` token to restore the previous state
on exit. This means **nesting works correctly** — inner calls can add fields
without clobbering the outer scope's values.
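
A minimal sketch of the token-based restore, using only the standard library. `demo_log_context` and `_fields` are hypothetical names, and the merge-with-outer-scope behaviour shown here is an assumption about how nesting is meant to work rather than a copy of Mellea's code.

```python
import contextlib
import contextvars

# Hypothetical storage for the active fields (assumption for illustration).
_fields = contextvars.ContextVar("demo_fields", default={})


@contextlib.contextmanager
def demo_log_context(**fields):
    # set() returns a token that remembers the previous value
    token = _fields.set({**_fields.get(), **fields})
    try:
        yield
    finally:
        _fields.reset(token)  # runs on normal exit and on exceptions


seen = []
with demo_log_context(trace_id="abc"):
    with demo_log_context(span_id="s-1", trace_id="inner"):
        seen.append(dict(_fields.get()))  # inner scope: merged and overridden
    seen.append(dict(_fields.get()))      # outer scope restored
seen.append(dict(_fields.get()))          # everything cleared

print(seen)
```

Because each scope resets its own token, the inner block cannot leave stale fields behind even if it raises.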

## Async and thread safety

`log_context` uses `contextvars.ContextVar`, which is safe for concurrent
asyncio tasks:

- Each `asyncio.Task` gets its own copy of the context.
- Fields set in one task do not bleed into sibling tasks.

**Plugin hooks**: Mellea hooks (`AUDIT`, `SEQUENTIAL`, `CONCURRENT`) are
`await`ed in the same asyncio task as the call site. `ContextVar` state IS
inherited — fields set around a `strategy.sample()` call will appear on
records emitted inside hook handlers automatically.
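
Both behaviours can be checked with a small standard-library sketch. The `request_id` variable, `hook_handler`, and `worker` are hypothetical names for this example; `hook_handler` merely stands in for a hook that is awaited from the call site and is not Mellea's hook API.

```python
import asyncio
import contextvars

request_id = contextvars.ContextVar("request_id", default=None)


async def hook_handler():
    # Awaited directly, so it runs in the caller's task and sees its context.
    return request_id.get()


async def worker(name, results):
    request_id.set(name)
    await asyncio.sleep(0)  # yield so the sibling task runs in between
    results[name] = await hook_handler()


async def main():
    results = {}
    # gather() wraps each coroutine in its own Task with a copied context
    await asyncio.gather(worker("req-a", results), worker("req-b", results))
    return results, request_id.get()


results, outer = asyncio.run(main())
print(results, outer)
```

Each worker reads back its own value, and the value set inside the tasks is not visible in `main()` afterwards.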

## Checklist before committing

1. New log calls use `MelleaLogger.get_logger()`, not `logging.getLogger(...)`.
2. Context fields use canonical names from the table above.
3. No reserved attribute names passed to `log_context`.
4. Scoped fields use `with log_context(...)`, not `set_log_context` (unless
managing an `__enter__`/`__exit__` pair).
5. Hook handlers rely on the context inherited from the call site; they do not
re-inject fields the caller has already set.
6. New events that span multiple log records inject fields via context, not by
repeating them on every `logger.info(...)` call.
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -439,8 +439,8 @@ CICD=1 uv run pytest

```python
# Enable debug logging
-from mellea.core import FancyLogger
-FancyLogger.get_logger().setLevel("DEBUG")
+from mellea.core import MelleaLogger
+MelleaLogger.get_logger().setLevel("DEBUG")

# See exact prompt sent to LLM
print(m.last_prompt())
@@ -4,15 +4,15 @@

from mellea import MelleaSession
from mellea.backends import ModelOption
-from mellea.core import FancyLogger
+from mellea.core import MelleaLogger
from mellea.stdlib.components import Message

from .._prompt_modules import PromptModule, PromptModuleString
from ._exceptions import BackendGenerationError, TagExtractionError
from ._prompt import get_system_prompt, get_user_prompt
from ._types import SubtaskPromptConstraintsItem

-FancyLogger.get_logger().setLevel("DEBUG")
+MelleaLogger.get_logger().setLevel("DEBUG")

T = TypeVar("T")

@@ -175,7 +175,7 @@ def _default_parser(generated_str: str) -> list[SubtaskPromptConstraintsItem]:
).strip()
else:
# Fallback to raw text if tags are missing
-FancyLogger.get_logger().warning(
+MelleaLogger.get_logger().warning(
"Expected tags missing from LLM response; falling back to raw response text. "
"Downstream stages may receive unstructured content."
)
@@ -202,7 +202,7 @@ def _default_parser(generated_str: str) -> list[SubtaskPromptConstraintsItem]:
# If content exists but no list items were parsed,
# treat the whole text as a single constraint.
if subtask_constraint_assign_str and not subtask_constraint_assign:
-FancyLogger.get_logger().warning(
+MelleaLogger.get_logger().warning(
"No list-style constraints detected; falling back to full text as a single constraint."
)
subtask_constraint_assign = [subtask_constraint_assign_str]
4 changes: 2 additions & 2 deletions docs/AGENTS_TEMPLATE.md
@@ -160,8 +160,8 @@ Session methods: `ainstruct`, `achat`, `aact`, `avalidate`, `aquery`, `atransfor

#### 12. Debugging
```python
-from mellea.core import FancyLogger
-FancyLogger.get_logger().setLevel("DEBUG")
+from mellea.core import MelleaLogger
+MelleaLogger.get_logger().setLevel("DEBUG")
```
- `m.last_prompt()` — see exact prompt sent

4 changes: 2 additions & 2 deletions docs/docs/community/contributing-guide.md
@@ -315,10 +315,10 @@ CICD=1 uv run pytest
### Debugging tips

```python
-from mellea.core import FancyLogger
+from mellea.core import MelleaLogger

# Enable debug logging
-FancyLogger.get_logger().setLevel("DEBUG")
+MelleaLogger.get_logger().setLevel("DEBUG")

# Inspect the exact prompt sent to the LLM
print(m.last_prompt())
68 changes: 59 additions & 9 deletions docs/docs/evaluation-and-observability/logging.md
@@ -14,22 +14,23 @@ Both work simultaneously when enabled.

## Console logging

-Mellea uses `FancyLogger`, a color-coded singleton logger built on Python's
+Mellea uses `MelleaLogger`, a color-coded singleton logger built on Python's
`logging` module. All internal Mellea modules obtain their logger via
-`FancyLogger.get_logger()`.
+`MelleaLogger.get_logger()`.

### Configuration

| Variable | Description | Default |
| -------- | ----------- | ------- |
-| `DEBUG` | Set to any value to enable `DEBUG`-level output | unset (`INFO` level) |
-| `FLOG` | Set to any value to forward logs to a local REST endpoint at `http://localhost:8000/api/receive` | unset |
+| `MELLEA_LOG_LEVEL` | Log level name (e.g. `DEBUG`, `INFO`, `WARNING`) | `INFO` |
+| `MELLEA_LOG_JSON` | Set to any truthy value (`1`, `true`, `yes`) to emit structured JSON instead of colour-coded output | unset |
+| `MELLEA_FLOG` | Set to any value to forward logs to a local REST endpoint at `http://localhost:8000/api/receive` | unset |

-By default, `FancyLogger` logs at `INFO` level with color-coded output to
-stdout. Set the `DEBUG` environment variable to lower the level to `DEBUG`:
+By default, `MelleaLogger` logs at `INFO` level with color-coded output to
+stdout. Set `MELLEA_LOG_LEVEL` to change the level:

```bash
-export DEBUG=1
+export MELLEA_LOG_LEVEL=DEBUG
python your_script.py
```
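
As a rough sketch of how the variables in the table might be interpreted, the helpers below read them from a mapping. `resolve_level` and `json_enabled` are hypothetical names, and the case-insensitive matching and the fallback to `INFO` for unrecognised names are assumptions of this example, not documented Mellea behaviour.

```python
import logging
import os


def resolve_level(env=os.environ):
    """Hypothetical helper: map MELLEA_LOG_LEVEL to a logging level."""
    name = env.get("MELLEA_LOG_LEVEL", "INFO").strip().upper()
    level = logging.getLevelName(name)
    # getLevelName returns an int for known names and a string otherwise
    return level if isinstance(level, int) else logging.INFO


def json_enabled(env=os.environ):
    """Hypothetical helper: treat 1/true/yes as truthy for MELLEA_LOG_JSON."""
    return env.get("MELLEA_LOG_JSON", "").strip().lower() in {"1", "true", "yes"}


print(resolve_level({"MELLEA_LOG_LEVEL": "DEBUG"}), json_enabled({"MELLEA_LOG_JSON": "1"}))
```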

@@ -50,6 +51,55 @@ Each message is formatted as:
message
```

## Sample output

### Console format (default)

Running `m.instruct(...)` inside a session produces lines like:

```text
=== 11:11:25-INFO ======
SUCCESS
```

### JSON format (`MELLEA_LOG_JSON=1`)

With structured JSON output enabled, the same `SUCCESS` record looks like:

```json
{
  "timestamp": "2026-04-08T11:11:25",
  "level": "INFO",
  "message": "SUCCESS",
  "module": "base",
  "function": "sample",
  "line_number": 258,
  "process_id": 73738,
  "thread_id": 6179762176,
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "backend": "OllamaModelBackend",
  "model_id": "granite4:micro",
  "strategy": "RejectionSamplingStrategy",
  "loop_budget": 3
}
```

The `session_id`, `backend`, `model_id`, `strategy`, and `loop_budget` fields
are injected automatically when the call runs inside a `with session:` block.
They appear on every log record within that scope.
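
A formatter that produces records of this shape could be sketched with the standard library as follows. `DemoJsonFormatter` and `CONTEXT_KEYS` are hypothetical names; the key names are taken from the sample above, but this is an illustration and not Mellea's actual formatter.

```python
import json
import logging

# Context fields shown in the sample record (illustrative subset).
CONTEXT_KEYS = ("session_id", "backend", "model_id", "strategy", "loop_budget")


class DemoJsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per log record."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line_number": record.lineno,
            "process_id": record.process,
            "thread_id": record.thread,
        }
        # Only include context fields that were actually attached
        for key in CONTEXT_KEYS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


record = logging.LogRecord(
    "demo", logging.INFO, "base.py", 258, "SUCCESS", None, None, func="sample"
)
record.session_id = "550e8400-e29b-41d4-a716-446655440000"
record.loop_budget = 3

parsed = json.loads(DemoJsonFormatter().format(record))
print(parsed["message"], parsed["loop_budget"])
```

Fields that were never injected, such as `backend` in this example, are simply absent from the output.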

### Adding custom context fields

Use `log_context` to attach your own fields for the duration of a block:

```python
from mellea.core import log_context

with log_context(request_id="req-abc", user_id="usr-42"):
    result = m.instruct("Summarise this document")
    # Every log record emitted here will include request_id and user_id
```

## OTLP log export

When the `[telemetry]` extra is installed, Mellea can export logs to an OTLP
@@ -74,11 +124,11 @@ export OTEL_SERVICE_NAME=my-mellea-app

### How it works

-When `MELLEA_LOGS_OTLP=true`, `FancyLogger` adds an OpenTelemetry
+When `MELLEA_LOGS_OTLP=true`, `MelleaLogger` adds an OpenTelemetry
`LoggingHandler` alongside its existing handlers:

- **Console handler** — continues to work normally (color-coded output)
-- **REST handler** — continues to work normally (when `FLOG` is set)
+- **REST handler** — continues to work normally (when `MELLEA_FLOG` is set)
- **OTLP handler** — exports logs to the configured OTLP collector

Logs are exported using OpenTelemetry's Logs API with batched processing
2 changes: 1 addition & 1 deletion docs/docs/evaluation-and-observability/telemetry.md
@@ -140,7 +140,7 @@ and troubleshooting.

## Logging

-Mellea uses a color-coded console logger (`FancyLogger`) by default. When the
+Mellea uses a color-coded console logger (`MelleaLogger`) by default. When the
`[telemetry]` extra is installed and `MELLEA_LOGS_OTLP=true` is set, Mellea
also exports logs to an OTLP collector alongside existing console output.

4 changes: 2 additions & 2 deletions docs/examples/agents/react/react_from_scratch/react.py
@@ -12,10 +12,10 @@

import mellea
import mellea.stdlib.components.chat
-from mellea.core import FancyLogger
+from mellea.core import MelleaLogger
from mellea.stdlib.context import ChatContext

-FancyLogger.get_logger().setLevel("ERROR")
+MelleaLogger.get_logger().setLevel("ERROR")

react_system_template: Template = Template(
"""Answer the user's question as best you can.
2 changes: 1 addition & 1 deletion docs/examples/mcp/mcp_example.py
@@ -16,7 +16,7 @@
from mellea import MelleaSession
from mellea.backends import ModelOption, model_ids
from mellea.backends.ollama import OllamaModelBackend
-from mellea.core import FancyLogger, ModelOutputThunk, Requirement
+from mellea.core import MelleaLogger, ModelOutputThunk, Requirement
from mellea.stdlib.requirements import simple_validate
from mellea.stdlib.sampling import RejectionSamplingStrategy

4 changes: 2 additions & 2 deletions docs/examples/mify/rich_table_execute_basic.py
@@ -5,10 +5,10 @@

from mellea import start_session
from mellea.backends import ModelOption, model_ids
-from mellea.core import FancyLogger
+from mellea.core import MelleaLogger
from mellea.stdlib.components.docs.richdocument import RichDocument, Table

-FancyLogger.get_logger().setLevel("ERROR")
+MelleaLogger.get_logger().setLevel("ERROR")

"""
Here we demonstrate the use of the (internally m-ified) class
4 changes: 2 additions & 2 deletions docs/examples/sofai/sofai_graph_coloring.py
@@ -23,7 +23,7 @@

import mellea
from mellea.backends.ollama import OllamaModelBackend
-from mellea.core import FancyLogger
+from mellea.core import MelleaLogger
from mellea.stdlib.components import Message
from mellea.stdlib.context import ChatContext
from mellea.stdlib.requirements import ValidationResult, req
@@ -230,5 +230,5 @@ def main():

if __name__ == "__main__":
# Set logging level
-FancyLogger.get_logger().setLevel(logging.INFO)
+MelleaLogger.get_logger().setLevel(logging.INFO)
main()
2 changes: 1 addition & 1 deletion docs/metrics/coverage-baseline.json
@@ -49,7 +49,7 @@
"ComponentParseError",
"Context",
"ContextTurn",
-"FancyLogger",
+"MelleaLogger",
"Formatter",
"GenerateLog",
"GenerateType",