Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
89 commits
Select commit Hold shift + click to select a range
5f1b4d0
[E-Document] Add Purchase Credit Memo support for PEPPOL CreditNote i…
Mar 31, 2026
92d6468
[E-Document] Address PR review: extract shared FinishDraft logic, fix…
Mar 31, 2026
29d7ff6
[E-Document] Complete PEPPOL handler: fix CreditNote DueDate, add att…
Mar 31, 2026
0b4edda
[E-Document] Add PEPPOL handler test coverage for completeness items
Mar 31, 2026
0b7c951
[E-Document] Fix PEPPOL handler: RegistrationName fallback, GLN test …
Mar 31, 2026
a0793e7
[E-Document] Fix CreditNote test XML: move DueDate to PaymentMeans pe…
Mar 31, 2026
c85f246
[E-Document] Set Applies-to Doc. No. on Purchase Credit Memo from Bil…
Mar 31, 2026
71cdf00
[E-Document] Refactor PEPPOL handler: move extraction logic to utility
Mar 31, 2026
f32e969
telemetry tags
Apr 1, 2026
06ea094
[E-Document] Add Data Exchange handler skeleton and enum registration
Apr 1, 2026
e39e06b
[E-Document] Add Data Exchange v2 handler test XML resources
Apr 1, 2026
af5641c
[E-Document] Implement Data Exchange v2 bridge: auto-detection, field…
Apr 1, 2026
3906be3
[E-Document] Add Data Exchange v2 handler tests
Apr 1, 2026
a855360
[E-Document] Fix auto-detection: remove Commit()/TryFunction for pipe…
Apr 1, 2026
6e33875
[E-Document] Fix Data Exchange handler: namespace-based def matching,…
Apr 1, 2026
2d8526c
Merge main into magnushar/edoc-import-v2-data-exchange
Apr 8, 2026
0eed61a
Remove duplicate pre-rename codeunit files (IDs 6403, 6406)
Apr 9, 2026
5cf75c1
Fix build errors: unused method/param, unnecessary begin..end, text o…
Apr 10, 2026
bb214ab
Fix AA0181/AA0233: use FindSet instead of FindFirst when followed by …
Apr 10, 2026
207f2e8
Fix AA0181/AA0233: use repeat...until loop for FindSet+Next
Apr 10, 2026
62951af
Add error diagnostics to test ProcessEDocumentToStep
Apr 10, 2026
58f6fc8
Fix AL0185: use E-Document Log instead of Error Message table
Apr 10, 2026
07acb20
Fix AL0132: use correct field name Status on E-Document Log
Apr 10, 2026
9048edb
Fix AA0217: replace StrSubstNo with string concatenation
Apr 10, 2026
9b21a6a
Ensure PEPPOL Data Exchange Definitions exist in test setup
Apr 10, 2026
4e8cb64
Rename Vendor Invoice No. to Applies-to Ext. Invoice No. and align PE…
Apr 13, 2026
9240933
Rename Data Exch. Handler to PEPPOL DX Handler and thread DocType
Apr 13, 2026
0cff1d7
Rename file to match object name E-Doc. PEPPOL DX Handler
Apr 13, 2026
d78d45a
Add v2 Data Exchange Definitions and use ProcessDataExchange
Apr 14, 2026
79a1b59
Remap v2 Data Exchange defs to staging tables and replace hardcoded b…
Apr 14, 2026
778f6e0
Fix Data Exch. Def codes to ≤20 chars and remove file name spaces
Apr 14, 2026
34d5c0d
Merge branch 'main' of https://github.com/microsoft/BCApps into magnu…
May 18, 2026
849cdca
Fix test
May 19, 2026
f7ae9de
More tests
May 19, 2026
cecc420
Fix AA0175: use IsEmpty instead of FindFirst for existence check
May 20, 2026
e0582aa
Merge remote-tracking branch 'origin/main' into magnushar/edoc-import…
May 20, 2026
555e94b
Fix AL0118: use Subc. Standard Task Code field on Requisition Line
May 20, 2026
c2934b9
Fix AL0118/AL0132: rename Standard Task Code on Requisition Line in test
May 20, 2026
3866420
Revert "Fix AL0118/AL0132: rename Standard Task Code on Requisition L…
May 20, 2026
2541ac4
Revert "Fix AL0118: use Subc. Standard Task Code field on Requisition…
May 20, 2026
a96522c
Merge remote-tracking branch 'origin/main' into magnushar/edoc-import…
May 20, 2026
a0c15e8
Trigger CI re-run after infrastructure failures
May 20, 2026
2bd50d7
Address PR review feedback: cleanup, error handling, vendor filter
May 21, 2026
84afd82
Move PEPPOL-specific post-processing to Data Exchange Post-Mapping co…
May 21, 2026
b3664ea
Remove TryFunction from pipeline - incompatible with DB inserts in ca…
May 21, 2026
1d308a3
Fix charge line ordering: move MapChargeLinesToStaging to event subsc…
May 21, 2026
516c901
Refactor: generic Data Exchange purchase import handler
May 21, 2026
01c317a
Rename V2 Data Exchange defs and add upgrade step
May 21, 2026
ed2013a
Fix order
May 21, 2026
097f4da
Remove commented-out draft code from IStructuredDataType interface
May 21, 2026
cc57e8e
Add MLLM V2 agentic extraction design spec
May 26, 2026
703b6a9
Increase MLLM V2 tool call budget to 200
May 26, 2026
db91c46
Add MLLM V2 implementation plan
May 26, 2026
986c0eb
Redesign Task 4 prompt: agent understands invoice structure, no hardc…
May 26, 2026
d725559
Add "MLLM V2" enum value for agentic extraction handler
May 26, 2026
cdf68ab
Add EDocMLLMVerifyTools with 6 verification methods and unit tests
May 26, 2026
8e351d2
Add 6 AOAI Function tool adapters for MLLM V2 verification
May 26, 2026
6f8b95f
Add MLLM V2 system prompt
May 26, 2026
93d7b7c
Add EDocumentMLLMHandlerV2 with agentic plan-act-verify loop
May 26, 2026
9bf70c5
Bump E-Document Core version for MLLM V2
May 26, 2026
4f60c63
Fix codeunit IDs (6110-6114, 6153-6154, 6405) and add missing System.…
May 27, 2026
2e6656a
Renumber MLLM V2 codeunits to 6311-6318 (within idRanges [6311..6331])
May 27, 2026
0bcbd49
Fix AA0215 file renames, remove unused Telemetry/FeatureNameLbl/GetIn…
May 27, 2026
b10838e
Remove [NonDebuggable] from all System Application AI module procedures
May 27, 2026
6920aa4
Scope [NonDebuggable] to GetUserPromptText only; CallMLLMV2 is fully …
May 27, 2026
0be0ddf
Isolate .Unwrap() to tiny [NonDebuggable] UnwrapSecret helper; all ot…
May 27, 2026
d0c7cee
Fix agentic loop: remove redundant AppendFunctionResponsesToChatMessa…
May 27, 2026
3f4a524
Improve verify feedback: implied gross price hint + explicit correcti…
May 27, 2026
2051768
Return Text from Execute() so Format() produces readable JSON for the…
May 27, 2026
ef1e347
Generalise VerifyLineMath error: remove invoice-specific hints, keep …
May 27, 2026
06e4acf
Prefer allowance_charge.percent over amount.value for line discount c…
May 27, 2026
9341963
Cap line discount at subtotal instead of erroring; allows draft page …
May 27, 2026
b46b6eb
Add EDocMLLMExtractionPlan state codeunit and plan tools (analyze_inv…
May 27, 2026
4133a2b
Wire verify tools to auto-mark extraction plan; register plan tools i…
May 27, 2026
f6411ac
Update V2 system prompt: enforce analyze_invoice first, checklist-dri…
May 27, 2026
8ff221a
Prompt: make checklist the explicit driver of Phase 3, not just guidance
May 27, 2026
f92141b
Add mark_item tool: model explicitly tracks checklist state
May 27, 2026
f592195
Remove auto-marking from verify tools: they are now pure validators
May 27, 2026
c338a07
Register mark_item tool; update prompt to require explicit mark_item …
May 27, 2026
ba43f15
Fix null in get_checklist GetPrompt: add empty properties to paramete…
May 27, 2026
2df9aed
Fix history window: expand HistoryLength for assistant tool-call mess…
May 27, 2026
db925c5
Revert "Fix history window: expand HistoryLength for assistant tool-c…
May 27, 2026
cfbb8a9
Prompt: explicitly instruct model to read totals section from documen…
May 27, 2026
69c744d
Add verify_payable tool; fix verify_invoice_totals for header discoun…
May 27, 2026
80d62af
Add submit_extraction tool; store verified JSON in plan state
May 27, 2026
ddcf7fd
Register new tools in handler; SetHistoryLength 500; use plan JSON as…
May 27, 2026
6961ab9
Update prompt: submit_extraction replaces JSON output; add verify_pay…
May 27, 2026
121aeeb
Fix submit_extraction: handle JSON passed as object token, not just s…
May 27, 2026
302a530
Fix prompt contradiction: model outputs JSON as final response AND ca…
May 27, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1,439 changes: 1,439 additions & 0 deletions docs/superpowers/plans/2026-05-26-mllm-v2-implementation.md

Large diffs are not rendered by default.

152 changes: 152 additions & 0 deletions docs/superpowers/specs/2026-05-26-mllm-v2-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
# E-Document MLLM Extraction V2 — Design Spec

**Date:** 2026-05-26
**Status:** Draft
**Replaces:** `EDocumentMLLMHandler.Codeunit.al` (V1)

---

## Problem

V1 performs a single-pass extraction: one AOAI call, one system prompt, one UBL JSON response. It has no mechanism to detect or correct errors it is confident about. Known failure modes observed in production:

- **Locale number formats** — Swedish `"2,34"` extracted as `234` (comma stripped by `AsDecimal()`)
- **Discount ambiguity** — invoice shows both gross price (`Pris`) and net price (`Pris efter rab.`); model uses net price AND applies a discount percentage, double-counting the discount
- **Silent wrong values** — extraction passes schema validation but produces semantically wrong output (wrong totals, wrong unit prices)

The root cause is that V1 sweeps the document left-to-right without understanding its structure, and has no self-correction capability.

---

## Solution: Plan-Act-Verify Agentic Loop

A single agentic AOAI call where the agent:

1. **Plans** — identifies document structure as chain-of-thought reasoning (regions, column roles, locale, flags) *before* extracting any values
2. **Acts** — extracts from the identified regions, guided by the structural understanding from the plan step
3. **Verifies** — calls deterministic AL-implemented tools to check its own output; self-corrects if tools report failures; repeats until all tools pass or the tool call budget is exhausted

The loop is entirely inside the model's reasoning turn. AL code sets up the tools and runs the agent; it does not orchestrate the plan/act/verify sequence.

---

## Architecture

### Single Agentic Call

```
PDF (base64)
┌─────────────────────────────────────────────────────┐
│ AGENT REASONING (one AOAI call, tool-use loop) │
│ │
│ 1. PLAN (chain-of-thought) │
│ "This is a Swedish invoice. Columns: Antal, │
│ Pris, Rabatt, Rabatt, Pris efter rab., Belopp. │
│ Decimal sep = comma. Two chained discount cols. │
│ Net price column present." │
│ │
│ 2. ACT (targeted extraction) │
│ Extract from identified regions using column │
│ roles, not left-to-right text sweep. │
│ │
│ 3. VERIFY (tool calls) │
│ verify_line_math() verify_totals() │
│ verify_vat() verify_dates() │
│ verify_required() verify_ranges() │
│ │
│ On failure → agent reads error, re-extracts, │
│ calls tools again. Loops until pass or budget. │
└─────────────────────────────────────────────────────┘
│ │
▼ ▼
Verified UBL JSON Error (budget
→ BC Purchase Draft exhausted)
```

**Model:** GPT-4.1 Mini (chosen for vision capability — the agent reads the PDF visually, not as extracted text)
**Tool call budget:** 200
**Temperature:** 0

### On Budget Exhaustion

E-Document status set to `Error`. A log entry records which verify check was still failing on the last iteration. No draft is created. ADI is not used as a fallback for verify failures.

ADI fallback is retained only for AOAI call failures (network error, content filter, empty response) — the same signal V1 uses today.

---

## New AL Components

### `EDocMLLMHandlerV2.Codeunit.al`

Implements `IStructureReceivedEDocument` (same interface as V1). Registered as enum value `"MLLM V2"` on `"Structure Data Impl."` — existing services using `"MLLM"` are unaffected until explicitly migrated.

Responsibilities:
- Build the AOAI chat messages (system prompt + PDF user message)
- Register the 6 verify tools as AOAI function definitions (`AOAITools`)
- Run the agentic dispatch loop in AL:
1. Call `GenerateChatCompletion`
2. If response contains tool call requests: execute via `EDocMLLMVerifyTools`, append results to `AOAIChatMessages`, increment call counter, go to 1
3. If response contains no tool calls (model is done): extract final JSON
4. If call counter exceeds budget: surface error
- On success: pass the final JSON to the existing `EDocMLLMSchemaHelper.MapHeaderFromJson` / `MapLinesFromJson` pipeline unchanged

The tool dispatch loop runs in AL, not inside the SDK. Each iteration is a new `GenerateChatCompletion` call with the tool results appended to the conversation history.

### `EDocMLLMVerifyTools.Codeunit.al`

Six methods, each returning `JsonObject` with `{ "pass": bool, "error": string }`:

| Tool | Inputs | Check |
|------|--------|-------|
| `verify_line_math` | unit_price, quantity, discount_pct, line_extension_amount | `unit_price × qty × (1 − disc/100) ≈ line_total` (within 1% relative tolerance) |
| `verify_invoice_totals` | line_amounts[], tax_exclusive_amount | `sum(lines) ≈ sub_total` (within 1% relative tolerance) |
| `verify_vat` | tax_exclusive_amount, vat_rate, tax_amount | `sub_total × rate/100 ≈ tax_amount` (within 1% relative tolerance) |
| `verify_dates` | issue_date, due_date | Both parse as valid dates; `due_date ≥ issue_date`; year in 1900–2100 |
| `verify_required_fields` | vendor_name, invoice_no, line_count | None are blank/zero |
| `verify_ranges` | quantities[], prices[], vat_rates[], discount_pcts[] | All > 0 (qty, price); 0–100 (vat, discount) |

Numeric tolerance for amount comparisons: 1% relative (`|expected − actual| / max(|actual|, 1) < 0.01`). A fixed absolute tolerance fails on large-quantity invoices where per-unit rounding accumulates (e.g. 1083 items × 0.005 rounding = 5.4 max error).

### `EDocMLLMExtractionV2-SystemPrompt.md`

New prompt resource. Three explicit sections:

1. **Structure identification** — "Before extracting any values, describe in your reasoning: document type, language, decimal separator, thousands separator, line item table column names and their roles (gross price, discount %, net price, quantity, line total), header and totals regions, and any flags (e.g. multiple discount columns, net price column present)."

2. **Targeted extraction** — "Extract data from the regions you identified. Do not sweep left-to-right across the full page. Use the column roles you identified to assign values correctly. Use XML decimal format (period as decimal separator, no thousands separators)."

3. **Verification** — "After producing the UBL JSON, call the verify tools on your output. If any tool reports a failure, read the error message, correct the relevant fields, and call the tools again. Finalize only when all tools pass."

---

## What Is Unchanged

- `EDocMLLMSchemaHelper.Codeunit.al` — `MapHeaderFromJson`, `MapLinesFromJson`, `GetDecimal` (with `Evaluate(..., 9)` from the V1 fix), `GetDate`
- `ubl_example.json` — UBL schema template (updated by V1 fix to use numeric `0` placeholders)
- `EDocMLLMHandler.Codeunit.al` — V1 stays registered under `"MLLM"` until removed

---

## What Is Retired

V1 (`"MLLM"` enum value and `EDocumentMLLMHandler.Codeunit.al`) is not removed in this change — existing service configurations keep working. A follow-up cleanup removes V1 once all services have migrated to `"MLLM V2"`.

---

## Error Flow

```
AOAI call fails entirely → FallbackToADI() (existing path)
AOAI returns bad JSON → FallbackToADI() (existing path)
Verify tools never all pass → EDocument.Status = Error + log entry
Vendor fields missing (schema) → FallbackToADI() (existing V1 ValidateMLLMResponse path)
```

---

## Open Questions

None — all design decisions confirmed.
Loading
Loading