Components
The hexagonal layout has three ports, one application service, and a composition root that wires them together.
Ports and adapters
    flowchart LR
        orch["Orchestrator<br/>(qfa.services)"]
        subgraph llmport["LLMPort"]
            direction TB
            tracking["TrackingLLMAdapter<br/>(decorator)"]
            litellm["LiteLLMClient"]
            tracking --> litellm
        end
        subgraph anonport["AnonymizationPort"]
            presidio["PresidioAnonymizer"]
        end
        subgraph usageport["UsageRepositoryPort"]
            sqlrepo["SqlAlchemyUsageRepository"]
        end
        orch -->|complete| tracking
        orch -->|anonymize / deanonymize| presidio
        tracking -.->|record_call| sqlrepo
        routes_usage["/v1/usage<br/>route"] -->|get_usage_stats| sqlrepo
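| Port | Adapter(s) | What it owns |
|---|---|---|
| LLMPort | LiteLLMClient, TrackingLLMAdapter | One method, complete. |
| AnonymizationPort | PresidioAnonymizer | anonymize and deanonymize. |
| UsageRepositoryPort | SqlAlchemyUsageRepository | record_call writes one record per LLM call; get_usage_stats backs the /v1/usage route. |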
The tracking decorator is the one place where the hexagonal practice of stacking adapters at the composition root earns its keep: TrackingLLMAdapter is itself an LLMPort, so the orchestrator never knows whether tracking is on.
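The decorator arrangement can be sketched in plain Python. This is a minimal illustration, not the project's code: the `complete(prompt)` signature, the `EchoLLM` and `InMemoryUsageRepository` stand-ins, and the recorded fields are all assumptions made for the example.

```python
import asyncio
from typing import Protocol


class LLMPort(Protocol):
    """Assumed port shape: a single async `complete` method."""

    async def complete(self, prompt: str) -> str: ...


class EchoLLM:
    """Stand-in for LiteLLMClient; returns a canned reply instead of calling a model."""

    async def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class InMemoryUsageRepository:
    """Stand-in for SqlAlchemyUsageRepository; keeps records in a list."""

    def __init__(self) -> None:
        self.calls: list[dict] = []

    async def record_call(self, **fields) -> None:
        self.calls.append(fields)


class TrackingLLMAdapter:
    """Decorator: satisfies LLMPort itself and wraps another LLMPort."""

    def __init__(self, inner: LLMPort, repo: InMemoryUsageRepository) -> None:
        self._inner = inner
        self._repo = repo

    async def complete(self, prompt: str) -> str:
        reply = await self._inner.complete(prompt)
        # Hypothetical record fields; the real schema is not shown in this document.
        await self._repo.record_call(prompt_chars=len(prompt), reply_chars=len(reply))
        return reply


async def summarize(llm: LLMPort, text: str) -> str:
    """Caller code depends only on the port, not on whether tracking is stacked."""
    return await llm.complete(f"Summarize: {text}")


repo = InMemoryUsageRepository()
plain = asyncio.run(summarize(EchoLLM(), "hi"))
tracked = asyncio.run(summarize(TrackingLLMAdapter(EchoLLM(), repo), "hi"))
```

Both calls return the same reply; the only observable difference is that the tracked call leaves one entry in `repo.calls`.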
The orchestrator
Orchestrator is one class with four async methods, each backing one HTTP endpoint:
| Method | Endpoint | What it does |
|---|---|---|
| | | One LLM call. Free-text summary of themes across submitted records. |
| | | One LLM call. Per-record summaries with a self-evaluated quality score. |
| | | Two LLM calls (summary + judge). Single aggregate summary with a calibrated score. |
| | | Multiple LLM calls per record: pick + judge at each level of a hierarchical coding framework. |
All four enter call_scope(tenant_id, operation) first; see Cross-cutting concerns for what that does.
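What call_scope actually does is covered in Cross-cutting concerns. Purely to illustrate the "enter the scope first" shape, the sketch below assumes it is a context manager that binds the tenant and operation to a context variable. The `_current_call` variable, the `analyze` method name, and the `ScopeReportingLLM` fake are hypothetical and exist only for this example.

```python
import asyncio
import contextvars
from contextlib import contextmanager

# Hypothetical context variable; the real call_scope may store different state.
_current_call = contextvars.ContextVar("current_call", default=None)


@contextmanager
def call_scope(tenant_id: str, operation: str):
    """Bind (tenant_id, operation) for the duration of one orchestrator call."""
    token = _current_call.set((tenant_id, operation))
    try:
        yield
    finally:
        _current_call.reset(token)


class Orchestrator:
    def __init__(self, llm) -> None:
        self._llm = llm

    async def analyze(self, tenant_id: str, records: list[str]) -> str:
        # `analyze` is a placeholder name, not one of the four real methods.
        with call_scope(tenant_id, "analyze"):
            return await self._llm.complete("\n".join(records))


class ScopeReportingLLM:
    """Fake port that reports which scope was active when it was called."""

    async def complete(self, prompt: str) -> str:
        return repr(_current_call.get())


seen = asyncio.run(Orchestrator(ScopeReportingLLM()).analyze("tenant-a", ["r1"]))
after = _current_call.get()
```

Under these assumptions, anything called inside the method (including a tracking adapter) could read the tenant and operation without them being threaded through every port signature.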
Composition root
qfa.api.app.create_app() builds the FastAPI instance; the lifespan context manager wires the dependency graph at startup. The wiring is roughly:
1. Load settings.
2. Construct the base LLMPort (a LiteLLMClient).
3. If DB_TRACK_USAGE is on:
   1. Construct the UsageRepositoryPort (a SqlAlchemyUsageRepository).
   2. Wrap the base LLMPort in a TrackingLLMAdapter that delegates to the inner port and records each call to the repository.
4. Construct the AnonymizationPort (a PresidioAnonymizer).
5. Construct the Orchestrator with the (possibly wrapped) ports.
6. Stash the orchestrator, API keys, and, when present, the usage repository on app.state for the request lifecycle to read.
This is the only place that knows about concrete adapter classes. Routes and dependencies read from app.state only.
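The wiring order can be sketched without FastAPI. This is an approximation under stated assumptions: the classes are empty stand-ins that reuse the real adapter names, `Settings` and its fields are hypothetical, the lifespan takes `settings` as an extra argument for brevity, and storing `None` for an absent usage repository is a choice made for the example.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class Settings:
    db_track_usage: bool = False  # stands in for DB_TRACK_USAGE
    api_keys: tuple = ()


# Empty stand-ins for the concrete adapters named in this section.
class LiteLLMClient: ...
class PresidioAnonymizer: ...
class SqlAlchemyUsageRepository: ...


class TrackingLLMAdapter:
    def __init__(self, inner, repo) -> None:
        self.inner, self.repo = inner, repo


class Orchestrator:
    def __init__(self, llm, anonymizer) -> None:
        self.llm, self.anonymizer = llm, anonymizer


@asynccontextmanager
async def lifespan(app, settings: Settings):
    llm = LiteLLMClient()                          # base LLMPort
    usage_repo = None
    if settings.db_track_usage:
        usage_repo = SqlAlchemyUsageRepository()   # UsageRepositoryPort
        llm = TrackingLLMAdapter(llm, usage_repo)  # wrap the base port
    anonymizer = PresidioAnonymizer()              # AnonymizationPort
    app.state.orchestrator = Orchestrator(llm, anonymizer)
    app.state.api_keys = settings.api_keys
    app.state.usage_repository = usage_repo        # None when tracking is off (assumed)
    yield


async def wire(settings: Settings):
    """Run the startup half of the lifespan against a bare app-like object."""
    app = SimpleNamespace(state=SimpleNamespace())
    async with lifespan(app, settings):
        return app.state


state_off = asyncio.run(wire(Settings(db_track_usage=False)))
state_on = asyncio.run(wire(Settings(db_track_usage=True)))
```

In this sketch, no class other than the lifespan refers to a concrete adapter, which mirrors the rule that routes and dependencies read only from app.state.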
Test seam
create_app(llm_factory=…) lets end-to-end tests inject a FakeLLMPort without monkey-patching. The lifespan still runs, so the real TrackingLLMAdapter, PresidioAnonymizer, and migrations all execute; only the bottom-most layer (the actual LLM call) is faked. See tests/e2e/conftest.py.
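A rough sketch of how such a seam can work, using stand-ins rather than the real application: it assumes `llm_factory` is a zero-argument callable returning an LLMPort, which this document does not confirm. The simplified `create_app`, the `track_usage` flag, and the list used as a usage store are hypothetical; the real fixtures are in tests/e2e/conftest.py.

```python
import asyncio
from types import SimpleNamespace


class FakeLLMPort:
    """Test double: returns a canned reply and remembers the prompts it saw."""

    def __init__(self, reply: str = "canned") -> None:
        self.reply = reply
        self.prompts: list[str] = []

    async def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


class NetworkLLMClient:
    """Stand-in for the production client; refuses to run inside this sketch."""

    async def complete(self, prompt: str) -> str:
        raise RuntimeError("would call a real model")


class TrackingLLMAdapter:
    """Simplified tracking layer that logs prompts to a plain list."""

    def __init__(self, inner, records: list) -> None:
        self._inner, self._records = inner, records

    async def complete(self, prompt: str) -> str:
        reply = await self._inner.complete(prompt)
        self._records.append(prompt)
        return reply


def create_app(llm_factory=NetworkLLMClient, track_usage: bool = True):
    """Simplified stand-in: only the bottom-most LLM layer is swappable."""
    llm = llm_factory()
    records: list[str] = []
    if track_usage:
        llm = TrackingLLMAdapter(llm, records)  # the tracking layer still wraps the fake
    return SimpleNamespace(state=SimpleNamespace(llm=llm, usage_records=records))


fake = FakeLLMPort()
app = create_app(llm_factory=lambda: fake)
reply = asyncio.run(app.state.llm.complete("hello"))
```

Because the fake is injected through a parameter rather than patched in, the test exercises the same wrapping path the production wiring would take.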