Skip to content

LLM providers

RCG uses an LLM in two places — the extraction step (raw rule text → canonical Rule) and the optional semantic judge (does this pair of rules conflict?). Both sit behind small protocols, so the model backend is pluggable.

RCG's LLM layer covers (a) cloud gateways — Amazon Bedrock, Azure AI Foundry, Google Vertex AI; (b) direct vendors — Anthropic, OpenAI; (c) aggregators — OpenRouter; plus the Gemini API (Google AI Studio), DeepSeek, Qwen, and any OpenAI-compatible endpoint (local vLLM/Ollama). All but Anthropic ride one OpenAI-compatible provider class.

Beyond Anthropic, RCG ships one OpenAI-compatible provider class (src/rcg/extractors/openai_provider.py) that drives any endpoint speaking the OpenAI Chat Completions API with function/tool calling: DeepSeek, Qwen (DashScope or a local server), OpenAI itself, OpenRouter, the Gemini API, Amazon Bedrock, Azure AI Foundry, Google Vertex AI, and local servers such as vLLM or Ollama. The endpoint is selected purely by base URL, model, and API key.

Install

The OpenAI-compatible providers live in an optional extra:

pip install 'rule-coherence-graph[openai]'

Anthropic and the offline mock provider need no extra.

Provider matrix

--provider Backend Default base URL Default model Key env (with fallback)
anthropic Anthropic Messages API (SDK default) claude-sonnet-4-6 ANTHROPIC_API_KEY
deepseek DeepSeek (OpenAI-compatible) https://api.deepseek.com deepseek-chat DEEPSEEK_API_KEYRCG_LLM_API_KEY
qwen Qwen via DashScope https://dashscope.aliyun.com/compatible-mode/v1 qwen-max DASHSCOPE_API_KEYRCG_LLM_API_KEY
openai OpenAI / any compatible endpoint SDK default or RCG_LLM_BASE_URL gpt-4o-mini OPENAI_API_KEYRCG_LLM_API_KEY
openrouter OpenRouter aggregator https://openrouter.ai/api/v1 anthropic/claude-sonnet-4 (overridable) OPENROUTER_API_KEYRCG_LLM_API_KEY
google Google Gemini API (AI Studio) https://generativelanguage.googleapis.com/v1beta/openai/ gemini-2.5-flash GEMINI_API_KEYGOOGLE_API_KEYRCG_LLM_API_KEY
bedrock Amazon Bedrock (OpenAI-compatible) https://bedrock-runtime.<region>.amazonaws.com/openai/v1 (region from RCG_LLM_REGION/AWS_REGION/AWS_DEFAULT_REGION, default us-east-1) openai.gpt-oss-120b-1:0 AWS_BEARER_TOKEN_BEDROCKRCG_LLM_API_KEY
azure Azure AI Foundry / Azure OpenAI <AZURE_OPENAI_ENDPOINT>/openai/v1 (v1 GA path; no api-version) model = deployment name (required, from RCG_LLM_MODEL; no default) AZURE_OPENAI_API_KEYRCG_LLM_API_KEY
vertex Google Vertex AI (OpenAI-compatible) https://<region>-aiplatform.googleapis.com/v1/projects/<project>/locations/<region>/endpoints/openapi (region from RCG_LLM_REGION/VERTEX_LOCATION, default us-central1; project from VERTEX_PROJECT/GOOGLE_CLOUD_PROJECT) required (from RCG_LLM_MODEL, e.g. google/gemini-2.5-flash; no default) GOOGLE_VERTEX_ACCESS_TOKENRCG_LLM_API_KEY (short-lived OAuth token)
mock Deterministic heuristics (offline) none
auto anthropic if ANTHROPIC_API_KEY is set, else mock

The model for any OpenAI-compatible preset can be overridden with RCG_LLM_MODEL.

Environment variables

Variable Purpose
RCG_LLM_BASE_URL Override the base URL for the generic openai provider (point it at a local vLLM/Ollama server). For bedrock, also overrides the region-derived URL (use it to pick the Mantle endpoint or another region/host).
RCG_LLM_MODEL Override the model id for any OpenAI-compatible provider/preset.
RCG_LLM_API_KEY Generic API-key fallback for every OpenAI-compatible preset (including bedrock).
RCG_LLM_REGION Region for the bedrock provider's base URL. Falls back to AWS_REGION, then AWS_DEFAULT_REGION, then us-east-1.
OPENAI_API_KEY Standard key the OpenAI SDK reads; used by the generic openai provider.
DEEPSEEK_API_KEY Key for the deepseek preset.
DASHSCOPE_API_KEY Key for the qwen preset.
AWS_BEARER_TOKEN_BEDROCK Amazon Bedrock API key used as the bearer token for the bedrock provider (not SigV4).
OPENROUTER_API_KEY Key for the openrouter aggregator preset.
GEMINI_API_KEY / GOOGLE_API_KEY Key for the google (Gemini API) preset; GEMINI_API_KEY is tried first, then GOOGLE_API_KEY.
AZURE_OPENAI_ENDPOINT Per-resource Azure endpoint (e.g. https://myres.openai.azure.com); RCG appends /openai/v1. Required for azure.
AZURE_OPENAI_API_KEY Azure OpenAI API key for the azure provider.
VERTEX_PROJECT / GOOGLE_CLOUD_PROJECT GCP project for the vertex provider's base URL (VERTEX_PROJECT first). Required for vertex.
VERTEX_LOCATION Region fallback for vertex after RCG_LLM_REGION; default us-central1.
GOOGLE_VERTEX_ACCESS_TOKEN Short-lived Google OAuth access token (gcloud auth print-access-token) used as the bearer for the vertex provider.

Key resolution order for each preset: explicit constructor argument → the preset-specific env var → generic RCG_LLM_API_KEY → (for the generic openai provider only) OPENAI_API_KEY. If no key is found, the CLI prints which env var to set and reminds you that --provider mock works offline.

Examples

# DeepSeek
export DEEPSEEK_API_KEY=sk-...
rcg check ./rules --provider deepseek

# Qwen (DashScope)
export DASHSCOPE_API_KEY=sk-...
rcg check ./rules --provider qwen

# OpenAI
export OPENAI_API_KEY=sk-...
rcg check ./rules --provider openai

# A local OpenAI-compatible server (Ollama, vLLM, ...)
export RCG_LLM_BASE_URL=http://localhost:11434/v1
export RCG_LLM_API_KEY=ollama          # most local servers accept any token
export RCG_LLM_MODEL=qwen2.5:7b
rcg check ./rules --provider openai

OpenRouter

OpenRouter aggregates many vendors behind one OpenAI-compatible API. Models use a vendor/model id:

export OPENROUTER_API_KEY=sk-or-...
rcg check ./rules --provider openrouter        # default model anthropic/claude-sonnet-4

# Pick any OpenRouter model with RCG_LLM_MODEL
export RCG_LLM_MODEL=openai/gpt-4o-mini
rcg check ./rules --provider openrouter

Google Gemini API (AI Studio)

The Gemini API exposes an OpenAI-compatible surface. The key is read from GEMINI_API_KEY, then GOOGLE_API_KEY:

export GEMINI_API_KEY=...
rcg check ./rules --provider google            # default model gemini-2.5-flash

Amazon Bedrock

Amazon Bedrock exposes an OpenAI-compatible Chat Completions endpoint, so RCG drives it through the same provider class. The base URL is derived from a region (RCG_LLM_REGIONAWS_REGIONAWS_DEFAULT_REGIONus-east-1):

export AWS_BEARER_TOKEN_BEDROCK=...        # your Bedrock API key
export RCG_LLM_REGION=us-west-2            # region where the model is enabled
export RCG_LLM_MODEL=openai.gpt-oss-120b-1:0
rcg check ./rules --provider bedrock

To use the newer "Mantle" endpoint (or any other host/region), set RCG_LLM_BASE_URL explicitly — it always wins over the region-derived URL:

export AWS_BEARER_TOKEN_BEDROCK=...
export RCG_LLM_BASE_URL=https://bedrock-mantle.us-west-2.api.aws/v1
rcg check ./rules --provider bedrock

Caveats:

  • Model availability is region-gated. The OpenAI-style gpt-oss models (openai.gpt-oss-120b-1:0, openai.gpt-oss-20b-1:0) launched in us-west-2. Pick a region and model id your account has enabled.
  • Auth via the OpenAI-SDK path uses an Amazon Bedrock API key as the bearer token, not SigV4. Set it in AWS_BEARER_TOKEN_BEDROCK (or the generic RCG_LLM_API_KEY).

Azure AI Foundry / Azure OpenAI

Azure uses the OpenAI SDK against a per-resource endpoint, and the "model" is the deployment name you created in your Azure resource. RCG targets the modern v1 GA path (<endpoint>/openai/v1), so no api-version query parameter is needed:

export AZURE_OPENAI_ENDPOINT=https://<res>.openai.azure.com   # trailing slash optional
export AZURE_OPENAI_API_KEY=...
export RCG_LLM_MODEL=<deployment-name>        # NOT a model id — your deployment name
rcg check ./rules --provider azure

Caveat: RCG_LLM_MODEL is the deployment name, not a model id (e.g. your deployment of gpt-4o), and it is required — there is no default. Set RCG_LLM_BASE_URL to override the computed <endpoint>/openai/v1 URL.

Google Vertex AI

Vertex AI exposes an OpenAI-compatible endpoint whose URL is built from your region and project. Auth is a short-lived Google OAuth access token, not a static API key:

export VERTEX_PROJECT=...                      # or GOOGLE_CLOUD_PROJECT
export RCG_LLM_REGION=us-central1              # or VERTEX_LOCATION; default us-central1
export GOOGLE_VERTEX_ACCESS_TOKEN=$(gcloud auth print-access-token)
export RCG_LLM_MODEL=google/gemini-2.5-flash  # required; no default
rcg check ./rules --provider vertex

Caveat (important): the access token from gcloud auth print-access-token is short-lived (~1 hour). Regenerate it for long runs, or it will expire mid-run. For production, use a service-account flow (e.g. a workload-identity or service-account credential that mints fresh tokens) and feed the resulting token into GOOGLE_VERTEX_ACCESS_TOKEN (or RCG_LLM_API_KEY). Set RCG_LLM_BASE_URL to override the computed endpoint URL.

The same provider names work for the semantic judge — run rcg check ./rules --semantic --provider deepseek and, when the preset's key is set, the judge uses the OpenAI-compatible endpoint too (otherwise it falls back to the offline mock judge). The benchmark accepts --judge deepseek|qwen|openai|openrouter|google|bedrock|azure|vertex as well.

Caveat: structured-output reliability

RCG relies on forced function/tool calling for structured extraction. Hosted endpoints (DeepSeek, Qwen, OpenAI, Amazon Bedrock's gpt-oss models) support this well. The provider also validates the returned arguments and retries once with an explicit nudge if the first response has no tool call or returns incomplete/unparseable JSON.

Reliability still varies by endpoint: very weak local models may fail to emit a clean tool call even after the retry, in which case extraction raises a clear error — prefer a stronger model for the extraction step.