Providers

The same connection string configures both the LLM formatter (--llm-format-model) and the navigator (--navigator-type). It selects an LLM backend and the model that runs on it. This is a Pro-tier capability.

Connection string format

protocol://[key@][host][/base-path]?model=NAME[&options]
  • protocol picks the backend (see below).
  • key is the API key for hosted providers, sent as key@.
  • host overrides the default endpoint, for self-hosted or proxied APIs.
  • base-path is appended to the host as the endpoint prefix — needed for AI gateways whose route includes a path. Omit it to hit the host root.
  • model names the model with ?model=.
  • options are query parameters that tune the request.
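The parts listed above can be put together mechanically. The following is a minimal shell sketch, assuming placeholder values: the key, host, base path, and variable names are illustrative only and do not come from zshot.

```shell
#!/bin/sh
# Illustrative placeholders; none of these values or names come from zshot.
protocol="openai"
key="sk-example"              # hypothetical API key
host="gateway.example.com"    # hypothetical proxy host
base_path="/v1/proxy"         # hypothetical gateway route prefix
model="gpt-5.5"
options="max_tokens=2048&temperature=0"

# Layout: protocol://[key@][host][/base-path]?model=NAME[&options]
# ${var:+...} emits the optional part only when the variable is non-empty.
conn="${protocol}://${key:+${key}@}${host}${base_path}?model=${model}${options:+&${options}}"
printf '%s\n' "$conn"
```

The resulting string is what would be passed to --llm-format-model or --navigator-type. Leaving key, host, or base_path empty drops that part, matching the optional brackets in the format.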

A bare protocol name with no URL is also valid for a hosted provider: a name like anthropic uses the provider’s default model and reads the key from the environment — see Credentials from the environment. There is no default backend; the navigator and formatter must each name one.

For local inference, run Ollama and point zshot at it with the ollama protocol. zshot does not embed a model runtime of its own.

Providers

The provider set is identical for the formatter and the navigator. Default model is what a bare provider name resolves to; key env var is read when no key@ is given in the connection string (first non-empty wins).

| Protocol | Backend | Key | Default model | Key env var |
| --- | --- | --- | --- | --- |
| ollama | Ollama, local or remote | none | gemma4 | (none) |
| anthropic | Anthropic Claude | required | claude-haiku-4-5 | ANTHROPIC_API_KEY |
| openai | OpenAI | required | gpt-5.5 | OPENAI_API_KEY |
| google | Google Gemini | required | gemini-3.5-flash | GOOGLE_API_KEY, GEMINI_API_KEY |
| mistral | Mistral | required | mistral-large-latest | MISTRAL_API_KEY |
| deepseek | DeepSeek | required | deepseek-v4-flash | DEEPSEEK_API_KEY |
| groq | Groq | required | llama-3.3-70b-versatile | GROQ_API_KEY |
| huggingface | HuggingFace Inference | required | (none) | HF_TOKEN, HUGGINGFACE_API_KEY |

Credentials from the environment

When the connection string omits the API key, zshot reads it from the provider’s conventional environment variable (the key env var column above). When the model is omitted too, it falls back to the provider’s default model. So with ANTHROPIC_API_KEY already exported, the navigator needs only:

zshot --navigate "Click the News link" --navigator-type anthropic https://example.com

A key or model written into the connection string always overrides the environment. HuggingFace has no default model, so it still needs ?model=.
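The "first non-empty wins" rule for a provider with two variables can be pictured with a short shell sketch. It assumes the variables are checked in the order the providers table lists them (GOOGLE_API_KEY, then GEMINI_API_KEY). resolve_google_key is a hypothetical helper written for illustration, not a zshot command, and the key value is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: mimics "first non-empty wins" for the google protocol.
# Assumption: lookup follows the order in which the table lists the variables.
resolve_google_key() {
  if [ -n "${GOOGLE_API_KEY:-}" ]; then
    printf '%s' "$GOOGLE_API_KEY"
  elif [ -n "${GEMINI_API_KEY:-}" ]; then
    printf '%s' "$GEMINI_API_KEY"
  else
    return 1   # nothing set: the connection string would need key@
  fi
}

# An empty first variable is skipped, so the second one is used.
GOOGLE_API_KEY=""
GEMINI_API_KEY="gem-example"   # placeholder value
resolve_google_key && echo
```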

Query options

| Option | Applies to | Effect |
| --- | --- | --- |
| model | hosted | Model name. Required unless a bare provider name supplies a default. |
| header | openai, anthropic | Custom request header as Name:Value. Repeatable. Sent verbatim, so treat the value as a secret. Unsupported on the other backends (errors). |
| tool | anthropic, openai | Enable a provider server-side tool. Repeatable. Unsupported on the other backends (errors). |
| max_tokens | all | Cap on the completion length. |
| temperature | all | Sampling temperature. Omitted unless set, so the provider’s own default applies. |
| top_p | all | Nucleus sampling cutoff. Omitted unless set, so the provider’s own default applies. |
| vision | all | Force vision input on or off. |
| insecure | hosted | Use http instead of https for a custom host. |

header attaches an extra HTTP request header. Each value is split on the first colon into name and value, and the option can be repeated to send multiple headers. Only the openai (and OpenAI-compatible) and anthropic backends carry them; the others reject header= rather than silently dropping it. Values are often secrets (gateway auth), so --llm-info reports header names only and never the values.

The value is URL-decoded like any query string: %20, +, and a literal space all become a space, and leading and trailing spaces are trimmed. A value that needs a literal +, such as a base64 token, must therefore percent-encode it as %2B; otherwise the + decodes to a space and the token is corrupted.
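One way to avoid the + pitfall is to percent-encode the header value before placing it in the connection string. The sketch below is a plain POSIX shell helper. The name urlencode and the sample token are assumptions made for illustration, and nothing here is provided by zshot. It encodes every byte outside the unreserved set, which is more than strictly required but decodes back to the same value.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode everything except A-Z a-z 0-9 . _ ~ -
# The body runs in a subshell so LC_ALL and the loop variables stay local.
urlencode() (
  LC_ALL=C
  s=$1
  out=""
  while [ -n "$s" ]; do
    rest=${s#?}          # string without its first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;   # "'c" yields the character code
    esac
    s=$rest
  done
  printf '%s' "$out"
)

token='ab+c/d=='                       # placeholder base64-style token
value=$(urlencode "Bearer $token")     # encode only the value, not "Name:"
printf 'header=cf-aig-authorization:%s\n' "$value"
```

Only the value after the first colon is encoded here; the header name and the colon are left as written.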

tool enables a provider-executed tool — the model searches or runs code on the provider’s side and the result feeds its answer, with no extra round trips through zshot. The value takes three forms:

  • A name: tool=web_search, tool=web_fetch, or tool=code_execution. On anthropic these resolve to the best tool version for the model: Claude 4.6 and newer get the dynamic-filtering versions (e.g. web_search_20260209), older models get the versions supported everywhere. On openai, only tool=web_search is available and the model must be search-capable (e.g. gpt-5-search-api).
  • A versioned Anthropic type, passed through as written: tool=web_search_20250305 pins a version regardless of model.
  • A base64 JSON data URI carrying a full Anthropic tool block, for configured cases like max_uses or allowed_domains: tool=data:application/json;base64,eyJ0eXBlIjoid2ViX3NlYXJjaF8yMDI1MDMwNSIsIm5hbWUiOiJ3ZWJfc2VhcmNoIiwibWF4X3VzZXMiOjN9
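The data URI in the third form can be produced from the JSON tool block with standard tools. This is a small sketch assuming a base64 utility is on the PATH. The variable names are illustrative, and the JSON is the block the example URI above decodes to.

```shell
#!/bin/sh
# JSON for a full Anthropic tool block (the decoded form of the example URI).
tool_json='{"type":"web_search_20250305","name":"web_search","max_uses":3}'

# Some base64 implementations wrap long output, so strip newlines first.
b64=$(printf '%s' "$tool_json" | base64 | tr -d '\n')

tool_uri="data:application/json;base64,${b64}"
printf 'tool=%s\n' "$tool_uri"
```

If the encoded output for a different tool block contains +, / or =, percent-encoding those characters before adding the value to the query string is the cautious choice.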

Tools are repeatable: ?tool=web_search&tool=web_fetch. Anthropic web search is billed per search in addition to tokens, and tool turns make more model round trips, so responses take longer. Tool traffic originates from the provider’s infrastructure, outside zshot’s robots.txt and --navigator-allow-url controls.

zshot -t llm_json \
  --llm-format-model "anthropic://?model=claude-haiku-4-5&tool=web_search" \
  --llm-format-prompt "Find this company's current stock price." \
  --llm-format-expects '{"type":"object","properties":{"stock_price":{"type":"number"}},"required":["stock_price"]}' \
  https://example.com -f out.json

max_tokens caps the model’s completion length. When omitted, most backends send no cap and the provider applies its own high default. Anthropic is the exception: its Messages API requires the field, so zshot supplies a default of 4096, generous enough that ordinary page summaries finish cleanly.[1] Raise it with ?max_tokens=8192 for very long extractions.

Examples

A hosted Anthropic model:

zshot -t llm_json \
  --llm-format-model "anthropic://$ANTHROPIC_API_KEY@?model=claude-haiku-4-5" \
  --llm-format-prompt "Summarize this page." \
  https://example.com -f out.json

Through a Cloudflare AI Gateway, whose endpoint includes a path. The path is the base prefix, ?model= names the model, and header= carries the gateway’s auth token:

zshot -t llm_json \
  --llm-format-model "openai://$OPENAI_API_KEY@gateway.ai.cloudflare.com/v1/$ACCOUNT/$GATEWAY/openai?model=gpt-5.5&header=cf-aig-authorization:Bearer%20$AIG_TOKEN" \
  --llm-format-prompt "Summarize this page." \
  https://example.com -f out.json

A local Ollama model driving the navigator:

zshot --navigate "Click the News link" \
  --navigator-type "ollama://localhost:11434?model=gemma4" \
  https://example.com

[1] When a completion does hit the cap, zshot reports “LLM output truncated at the N-token completion limit” and suggests a higher value, rather than failing with a confusing parse error.