# Components
The hexagonal layout has three ports, one application service, and a composition root that wires them together.
## Ports and adapters
```mermaid
flowchart LR
orch["Orchestrator
(qfa.services)"]
subgraph llmport["LLMPort"]
direction TB
tracking["TrackingLLMAdapter
(decorator)"]
litellm["LiteLLMClient"]
tracking --> litellm
end
subgraph anonport["AnonymizationPort"]
presidio["PresidioAnonymizer"]
end
subgraph usageport["UsageRepositoryPort"]
sqlrepo["SqlAlchemyUsageRepository"]
end
orch -->|complete| tracking
orch -->|anonymize / deanonymize| presidio
tracking -.->|record_call| sqlrepo
routes_usage["/v1/usage
route"] -->|get_usage_stats| sqlrepo
```
| Port | Adapter(s) | What it owns |
|---|---|---|
| {py:class}`~qfa.domain.ports.LLMPort` | {py:class}`~qfa.adapters.llm_client.LiteLLMClient`; optionally wrapped by {py:class}`~qfa.adapters.tracking_llm.TrackingLLMAdapter` when `DB_TRACK_USAGE=true` | One method, `complete(system_message, user_message, tenant_id, response_model, timeout)`. Returns `LLMResponse[T_Response]` carrying the structured output plus token counts and cost. |
| {py:class}`~qfa.domain.ports.AnonymizationPort` | {py:class}`~qfa.adapters.presidio_anonymizer.PresidioAnonymizer` | `anonymize(text) -> (text, mapping)` and `deanonymize(text, mapping) -> text`. The mapping is held in memory for the request lifetime, then discarded. |
| {py:class}`~qfa.domain.ports.UsageRepositoryPort` | {py:class}`~qfa.adapters.db.SqlAlchemyUsageRepository` | Writes one {py:class}`~qfa.domain.models.LLMCallRecord` per LLM call (from {py:class}`~qfa.adapters.tracking_llm.TrackingLLMAdapter`) and reads aggregate stats (from the `/v1/usage` routes). |
The tracking decorator is the only place hex's "stack adapters at the composition root" earns its keep — {py:class}`~qfa.adapters.tracking_llm.TrackingLLMAdapter` is itself an {py:class}`~qfa.domain.ports.LLMPort`, so the orchestrator never knows whether tracking is on.
## The orchestrator
{py:class}`~qfa.services.orchestrator.Orchestrator` is one class with four async methods, each backing one HTTP endpoint:
| Method | Endpoint | What it does |
|---|---|---|
| `analyze` | `POST /v1/analyze` | One LLM call. Free-text summary of themes across submitted records. |
| `summarize` | `POST /v1/summarize` | One LLM call. Per-record summaries with a self-evaluated quality score. |
| `summarize_aggregate` | `POST /v1/summarize-aggregate` | Two LLM calls (summary + judge). Single aggregate summary with a calibrated score. |
| `assign_codes` | `POST /v1/assign_codes` | Multiple LLM calls per record: pick + judge at each level of a hierarchical coding framework. |
All four enter `call_scope(tenant_id, operation)` first — see [Cross-cutting concerns](04-crosscutting.md) for what that does.
## Composition root
`qfa.api.app.create_app()` builds the FastAPI instance; the `lifespan` context manager wires the dependency graph at startup. The wiring is roughly:
1. Load settings.
2. Construct the base {py:class}`~qfa.domain.ports.LLMPort` (a {py:class}`~qfa.adapters.llm_client.LiteLLMClient`).
3. If `DB_TRACK_USAGE` is on:
- Construct the {py:class}`~qfa.domain.ports.UsageRepositoryPort` (a {py:class}`~qfa.adapters.db.SqlAlchemyUsageRepository`).
- Wrap the base {py:class}`~qfa.domain.ports.LLMPort` in a {py:class}`~qfa.adapters.tracking_llm.TrackingLLMAdapter` that delegates to the inner port and records each call to the repository.
4. Construct the {py:class}`~qfa.domain.ports.AnonymizationPort` (a {py:class}`~qfa.adapters.presidio_anonymizer.PresidioAnonymizer`).
5. Construct the {py:class}`~qfa.services.orchestrator.Orchestrator` with the (possibly wrapped) ports.
6. Stash the orchestrator, API keys, and — when present — the usage repository on `app.state` for the request lifecycle to read.
This is the **only** place that knows about concrete adapter classes. Routes and dependencies read from `app.state` only.
## Test seam
`create_app(llm_factory=…)` lets end-to-end tests inject a `FakeLLMPort` without monkey-patching. The lifespan still runs — so the *real* {py:class}`~qfa.adapters.tracking_llm.TrackingLLMAdapter`, {py:class}`~qfa.adapters.presidio_anonymizer.PresidioAnonymizer`, and migrations all execute. Only the bottom-most layer (the actual LLM call) is faked. See `tests/e2e/conftest.py`.