qfa.adapters.llm_client

LLM client adapter using LiteLLM for unified provider access.

Classes

LiteLLMClient(model, api_key, api_base, ...)

LLM adapter satisfying LLMPort via LiteLLM.

class qfa.adapters.llm_client.LiteLLMClient(model: str, api_key: str, api_base: str, api_version: str, chars_per_token: int, max_total_tokens: int)

Bases: LLMPort

LLM adapter satisfying LLMPort via LiteLLM.

Routes to any LLM provider based on the model string prefix (e.g. "azure/gpt-4", "azure_ai/mistral-large-2411"). Calculates per-call cost using LiteLLM’s built-in cost map or custom pricing registered via litellm.register_model().
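As a rough illustration of the two ideas above, the sketch below splits a "provider/model" identifier at its first slash and computes a per-call cost from per-token prices. The helpers split_provider and call_cost and the prices in EXAMPLE_PRICING are hypothetical and are not part of qfa or LiteLLM; only the key names input_cost_per_token and output_cost_per_token are meant to mirror the ones LiteLLM pricing entries use.

```python
def split_provider(model: str) -> tuple[str, str]:
    """Split a "provider/model" identifier at the first slash."""
    provider, sep, name = model.partition("/")
    if not sep:
        # No prefix present; return an empty provider rather than guessing one.
        return "", model
    return provider, name


def call_cost(prompt_tokens: int, completion_tokens: int, pricing: dict[str, float]) -> float:
    """Per-call cost: tokens in each direction times the matching per-token price."""
    return (
        prompt_tokens * pricing["input_cost_per_token"]
        + completion_tokens * pricing["output_cost_per_token"]
    )


# Hypothetical prices, for illustration only.
EXAMPLE_PRICING = {"input_cost_per_token": 2e-06, "output_cost_per_token": 6e-06}

provider, name = split_provider("azure_ai/mistral-large-2411")
cost = call_cost(1000, 500, EXAMPLE_PRICING)
```

How LiteLLMClient performs these steps internally is not shown on this page, so the sketch should be read as the arithmetic behind a cost figure rather than as the adapter's code.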

Parameters:
  • model (str) – LiteLLM model identifier (e.g. "azure_ai/mistral-large-2411").

  • api_key (str) – API key for the provider.

  • api_base (str) – Base URL for the provider endpoint. Empty string if not needed.

  • api_version (str) – API version string. Empty string if not needed.
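One possible way to assemble these constructor arguments is from environment-style settings, as in the sketch below. The function client_kwargs_from_env and every QFA_LLM_* variable name are assumptions made for this example. The values given for chars_per_token and max_total_tokens are placeholders: both appear in the signature but are not described on this page.

```python
from typing import Mapping


def client_kwargs_from_env(env: Mapping[str, str]) -> dict:
    """Build LiteLLMClient keyword arguments from a mapping such as os.environ."""
    return {
        "model": env["QFA_LLM_MODEL"],
        "api_key": env["QFA_LLM_API_KEY"],
        # Optional settings fall back to "" as the parameter docs describe.
        "api_base": env.get("QFA_LLM_API_BASE", ""),
        "api_version": env.get("QFA_LLM_API_VERSION", ""),
        # Required by the signature but undocumented here; placeholder defaults.
        "chars_per_token": int(env.get("QFA_LLM_CHARS_PER_TOKEN", "4")),
        "max_total_tokens": int(env.get("QFA_LLM_MAX_TOTAL_TOKENS", "8000")),
    }


kwargs = client_kwargs_from_env(
    {"QFA_LLM_MODEL": "azure/gpt-4", "QFA_LLM_API_KEY": "example-key"}
)
# With qfa installed, the client could then be created as:
#   client = LiteLLMClient(**kwargs)
```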

async complete(system_message: str, user_message: str, tenant_id: str, response_model: type[T_Response], timeout: float = 20.0) → LLMResponse

Send a completion request via LiteLLM.

Parameters:
  • system_message (str) – The system-level instruction for the model.

  • user_message (str) – The user-level message to complete.

  • timeout (float) – Maximum time in seconds to wait for a response.

  • tenant_id (str) – Tenant identifier passed as user for audit trail.

Returns:

The model’s response including token usage and cost.

Return type:

LLMResponse
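A call to complete() might look like the sketch below. To keep it runnable without qfa installed, it uses StubClient, a hypothetical stand-in that only copies the documented signature, and Summary, a hypothetical response model. The stub returns a plain dict rather than LLMResponse, because the fields of LLMResponse and the type that T_Response is bound to are not shown on this page.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class Summary:
    """Hypothetical response model, used only for this example."""
    text: str


class StubClient:
    """Stand-in with the documented complete() signature; not the real adapter."""

    async def complete(self, system_message: str, user_message: str, tenant_id: str,
                       response_model: type, timeout: float = 20.0) -> Any:
        # A real client would call the provider; the stub just echoes its inputs.
        return {
            "tenant_id": tenant_id,
            "timeout": timeout,
            "parsed": response_model(text=user_message.upper()),
        }


async def summarize(client: Any, text: str, tenant_id: str) -> Any:
    # Keyword arguments keep the call independent of parameter order.
    return await client.complete(
        system_message="Summarize the user's text in one sentence.",
        user_message=text,
        tenant_id=tenant_id,
        response_model=Summary,
        timeout=10.0,  # overrides the documented default of 20.0 seconds
    )


result = asyncio.run(summarize(StubClient(), "hello", tenant_id="tenant-42"))
```

Because complete() is a coroutine, it has to be awaited from async code or driven with asyncio.run() from a synchronous entry point, as the last line does.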

Raises: