
Models

Models define connections to LLM providers. Cloud providers (Anthropic, OpenAI, Gemini) have built-in model lists — just add your API key and all supported models are automatically available. Local providers (Ollama) use aliases to map HCL-friendly keys to model names.

Cloud Providers

model "anthropic" { provider = "anthropic" api_key = vars.anthropic_api_key } model "openai" { provider = "openai" api_key = vars.openai_api_key } model "gemini" { provider = "gemini" api_key = vars.gemini_api_key }

All supported models for each provider are available automatically — no need to list them.

Local Models (Ollama)

The ollama provider connects to any OpenAI-compatible local inference server. Use aliases to define which models are available and map HCL-safe keys to the actual model names.

model "local" { provider = "ollama" base_url = "http://localhost:11434/v1" aliases = { gemma4 = "gemma4" gemma4_26b = "gemma4:26b" nemotron = "nemotron-cascade-2:30b" } }

The alias key (left side) becomes the HCL reference name. The value (right side) is the exact model name sent to the server. This handles models with colons or hyphens that aren’t valid in HCL identifiers.

agent "researcher" { model = models.local.gemma4_26b # sends "gemma4:26b" to Ollama } agent "writer" { model = models.local.nemotron # sends "nemotron-cascade-2:30b" to Ollama }

The base_url should point to the OpenAI-compatible API endpoint. Common values:

Server      Base URL
Ollama      http://localhost:11434/v1
vLLM        http://localhost:8000/v1
llama.cpp   http://localhost:8080/v1
LM Studio   http://localhost:1234/v1
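Any of these servers can sit behind the ollama provider, since it only needs an OpenAI-compatible endpoint. A minimal sketch for vLLM (the alias key and served model name below are hypothetical placeholders):

model "vllm" {
  provider = "ollama"                     # the ollama provider works with any OpenAI-compatible server
  base_url = "http://localhost:8000/v1"   # vLLM's default endpoint

  aliases = {
    # Hypothetical: map an HCL-safe key to whatever model your server actually serves.
    llama3_70b = "meta-llama/Meta-Llama-3-70B-Instruct"
  }
}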

Attributes

Attribute        Type     Required          Description
provider         string   yes               Provider name: anthropic, openai, gemini, or ollama
api_key          string   cloud providers   API key (required for anthropic, openai, gemini)
base_url         string   ollama only       URL for the OpenAI-compatible API endpoint
aliases          map      ollama only       Map of HCL key → API model name
prompt_caching   bool     no                Enable prompt caching (default: true)
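Since prompt_caching defaults to true, it only needs to be set explicitly to turn it off. A minimal sketch:

model "anthropic" {
  provider       = "anthropic"
  api_key        = vars.anthropic_api_key
  prompt_caching = false   # opt out of prompt caching (defaults to true)
}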

Supported Models

See the full list of built-in models with pricing for every provider on the Supported Models page.

Referencing Models

Use models.<config_name>.<model_key> to reference a model:

agent "assistant" { model = models.anthropic.claude_sonnet_4 } agent "local_researcher" { model = models.local.gemma4_26b } mission "pipeline" { commander { model = models.anthropic.claude_sonnet_4 } }

Multiple Configs Per Provider

You can have multiple model configs for the same provider with different API keys:

model "anthropic_prod" { provider = "anthropic" api_key = vars.anthropic_prod_key } model "anthropic_dev" { provider = "anthropic" api_key = vars.anthropic_dev_key }

Custom Aliases for Cloud Providers

Cloud providers can also use aliases to add custom model name mappings or override the built-in ones:

model "anthropic" { provider = "anthropic" api_key = vars.anthropic_api_key aliases = { sonnet = "claude-sonnet-4-20250514" } } agent "assistant" { model = models.anthropic.sonnet # custom alias }

Pricing Overrides

Squadron includes built-in pricing for all supported models to estimate costs per turn. Override with custom pricing using pricing blocks:

model "anthropic" { provider = "anthropic" api_key = vars.anthropic_api_key pricing "claude_sonnet_4_6" { input = 2.50 # per 1M tokens output = 12.00 cache_read = 0.25 cache_write = 3.00 } }

Attribute     Type     Description
input         number   Cost per 1M input tokens (required)
output        number   Cost per 1M output tokens (required)
cache_read    number   Cost per 1M cached input tokens (optional, default: 0)
cache_write   number   Cost per 1M cache write tokens (optional, default: 0)

The pricing block label must match a model key. Costs are shown in the command center’s Costs tab.

Local Models and Cost Tracking

Local models (Ollama provider) have no built-in pricing since they run on your own hardware. Squadron still tracks token usage for every turn, so you can monitor how many tokens your local models consume even though the dollar cost is $0.
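If you want the Costs tab to show a nonzero estimate for a local model, it may be possible to attach a pricing block to the ollama config, assuming pricing blocks are honored for alias keys as well; this page only documents them as overrides for built-in cloud pricing, so treat the following as an unverified sketch with nominal rates you choose yourself:

model "local" {
  provider = "ollama"
  base_url = "http://localhost:11434/v1"

  aliases = {
    gemma4_26b = "gemma4:26b"
  }

  # Unverified assumption: a pricing block keyed to a local alias,
  # with nominal per-1M-token rates to approximate your own costs.
  pricing "gemma4_26b" {
    input  = 0.10
    output = 0.10
  }
}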
