Configuration

Layout & Overrides

oxicrab deserializes exactly one canonical Config schema. TOML layers are merged first, then validated once, so defaults and validation rules stay centralized in Rust rather than drifting across multiple loaders.

File Layout

config.toml → config.local.toml → config.d/*.toml → Env / Helper / Keyring

Merge precedence is left to right. Later TOML layers override earlier ones. After the merged file stack is deserialized, credential sources still resolve in the usual order: environment variables, then credential helper, then OS keyring, then TOML config.

Recommended Pattern

config.toml: committed or primary machine-local settings
config.local.toml: secrets and one-off local overrides
config.d/10-*.toml, 20-*.toml: environment or role-specific overlays with deterministic lexical ordering

Example

# ~/.oxicrab/config.toml
[agents.defaults.modelRouting]
default = "anthropic/claude-sonnet-4-5-20250929"

[gateway]
host = "127.0.0.1"

# ~/.oxicrab/config.local.toml
[providers.anthropic]
apiKey = "sk-ant-..."

# ~/.oxicrab/config.d/20-office.toml
[gateway]
host = "10.0.0.25"

The effective config uses the model from config.toml, the API key from config.local.toml, and the gateway host from config.d/20-office.toml.

Model Routing

oxicrab uses a 2-tier resolution strategy to determine which LLM provider handles a model:

Prefix notation — groq/llama-3.1-70b (provider before /, stripped before API call)
Auto-detection — Known model name prefixes: claude-* → Anthropic, gpt-*/o1/o3/o4 → OpenAI, gemini-* → Gemini, deepseek-* → DeepSeek

Routing Resolution Table

Model value	Resolved provider	API model sent
`claude-sonnet-4-5-20250929`	anthropic (auto)	`claude-sonnet-4-5-20250929`
`anthropic/claude-opus-4-6`	anthropic (prefix)	`claude-opus-4-6`
`groq/llama-3.1-70b`	groq (prefix)	`llama-3.1-70b`
`ollama/qwen3-coder:30b`	ollama (prefix)	`qwen3-coder:30b`
`ollama/meta-llama/Llama-3.3-70B`	ollama (prefix)	`meta-llama/Llama-3.3-70B`
`deepseek-chat`	deepseek (auto)	`deepseek-chat`

API Key Models

Set the model and provide the corresponding API key. The provider is auto-detected from the model name:

[agents.defaults.modelRouting]
default = "claude-sonnet-4-5-20250929"

[providers.anthropic]
apiKey = "sk-ant-api03-..."

Available API key models:

claude-sonnet-4-5-20250929 (Anthropic) — recommended, best balance
claude-haiku-4-5-20251001 (Anthropic) — fastest
claude-opus-4-5-20251101 (Anthropic) — most capable
gpt-4, gpt-3.5-turbo (OpenAI)
gemini-pro (Google)

Prompt caching: Anthropic providers automatically use cache_control on system messages and tool definitions, enabling up to 90% input token cost reduction for repeated content (5-minute TTL).

Supported Providers

The following providers are supported. Use prefix notation (e.g. groq/llama-3.1-70b) to route to them:

[agents.defaults.modelRouting]
default = "deepseek/deepseek-chat"

[providers.deepseek]
apiKey = "sk-..."

Provider	Default Base URL	API Key Required
Anthropic	https://api.anthropic.com/v1/messages	Yes (or OAuth)
OpenAI	https://api.openai.com/v1/chat/completions	Yes
Gemini	https://generativelanguage.googleapis.com/v1	Yes
OpenRouter	https://openrouter.ai/api/v1/chat/completions	Yes
DeepSeek	https://api.deepseek.com/v1/chat/completions	Yes
Groq	https://api.groq.com/openai/v1/chat/completions	Yes
Moonshot	https://api.moonshot.cn/v1/chat/completions	Yes
Zhipu	https://open.bigmodel.cn/api/paas/v4/chat/completions	Yes
DashScope	https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions	Yes
MiniMax	https://api.minimaxi.chat/v1/text/chatcompletion_v2	Yes
vLLM	http://localhost:8000/v1/chat/completions	No
Ollama	http://localhost:11434/v1/chat/completions	No

Local providers (Ollama and vLLM) do not require an API key. Use the provider/model prefix format to route to them — the prefix is stripped before sending to the API (e.g. ollama/qwen3-coder:30b sends qwen3-coder:30b to the Ollama API).

Prompt-Guided Tool Calling

Local models often ignore native JSON tool schemas, responding with plain text instead of structured tool calls. Enable promptGuidedTools on the provider to work around this:

[providers.ollama]
promptGuidedTools = true

When enabled, the PromptGuidedToolsProvider wrapper:

Injects tool definitions as structured text into the system prompt
Strips native tools/tool_choice from the API request
Parses <tool_call> XML blocks from the model's text response into tool calls
Rewrites conversation history so tool results work without native tool support

This works with both direct local models and modelRouting.fallbacks for fallback chains. Currently supported for Ollama and vLLM providers.

To override the default endpoint, set apiBase on the provider:

[providers.vllm]
apiKey = "token-abc123"
apiBase = "http://my-server:8080/v1/chat/completions"

Custom Headers

Inject custom HTTP headers into every request for an OpenAI-compatible provider. Useful for authentication proxies, routing, or API gateways:

[providers.openrouter]
apiKey = "sk-..."

[providers.openrouter.headers]
X-Custom-Header = "value"
HTTP-Referer = "https://myapp.example.com"

Headers are merged into every chat and warmup request alongside the standard Authorization and Content-Type headers. Reserved header names (Authorization, Content-Type, x-api-key, and other internal headers) are blocked and will cause a config validation error.

Per-Provider Temperature

Override the global agents.defaults.temperature for a specific provider. Useful for providers like Moonshot/Kimi that enforce their own temperature constraints:

[providers.moonshot]
apiKey = "sk-..."
temperature = 0.6

Resolution order: per-provider → global → omit (API default). In TOML, leave the field unset to omit it from the API payload and let the provider use its own default.

OAuth Models

Anthropic OAuth is attempted when the provider resolves to anthropic — via anthropic/ prefix or auto-detection of claude-* models:

anthropic/claude-opus-4-5
anthropic/claude-opus-4-6

For OAuth models, either install Claude CLI (auto-detected) or configure credentials manually:

[providers.anthropicOAuth]
enabled = true
autoDetect = true
credentialsPath = "~/.anthropic/credentials.json"

Credentials

Oxicrab supports multiple credential backends. Each layer only fills fields that are still empty, so higher-priority sources always win.

Resolution Order

Environment variables → Credential helper → OS keyring → config.toml

Use oxicrab credentials list to see where each credential comes from.

OS Keyring

Credentials can be stored in the OS keychain (macOS Keychain, GNOME Keyring, Windows Credential Manager) via the keyring-store feature (default-on). Containers should use OXICRAB_* env vars instead. See CLI Reference → credentials for keyring management commands.

Credential Helper

External programs (1Password, Bitwarden, custom scripts) can supply credentials at startup. Configure under the top-level credentialHelper key.

1Password (desktop)

[credentialHelper]
command = "op"
args = ["item", "get", "oxicrab", "--format", "json"]
format = "1password"

1Password (CI / containers)

[credentialHelper]
command = "op"
args = ["read", "-"]
format = "json"

Bitwarden

[credentialHelper]
command = "bw"
args = ["get", "item", "oxicrab"]
format = "bitwarden"

Custom script

[credentialHelper]
command = "/path/to/my-script.sh"
args = []
format = "json"

Format	Description
1password	Parses 1Password `op item get` JSON output (fields array)
bitwarden	Parses Bitwarden `bw get item` JSON output (fields array)
json	Expects a flat JSON object: `{"anthropic-api-key": "sk-...", ...}`
line	Expects `key=value` pairs, one per line

Environment Variables

All 29 credential slots can be set via environment variables. Recommended for containers and CI.

Providers

Variable	Config Field
OXICRAB_ANTHROPIC_API_KEY	providers.anthropic.apiKey
OXICRAB_OPENAI_API_KEY	providers.openai.apiKey
OXICRAB_OPENROUTER_API_KEY	providers.openrouter.apiKey
OXICRAB_GEMINI_API_KEY	providers.gemini.apiKey
OXICRAB_DEEPSEEK_API_KEY	providers.deepseek.apiKey
OXICRAB_GROQ_API_KEY	providers.groq.apiKey
OXICRAB_MOONSHOT_API_KEY	providers.moonshot.apiKey
OXICRAB_ZHIPU_API_KEY	providers.zhipu.apiKey
OXICRAB_DASHSCOPE_API_KEY	providers.dashscope.apiKey
OXICRAB_MINIMAX_API_KEY	providers.minimax.apiKey
OXICRAB_VLLM_API_KEY	providers.vllm.apiKey
OXICRAB_OLLAMA_API_KEY	providers.ollama.apiKey

OAuth

Variable	Config Field
OXICRAB_ANTHROPIC_OAUTH_ACCESS	providers.anthropicOAuth.accessToken
OXICRAB_ANTHROPIC_OAUTH_REFRESH	providers.anthropicOAuth.refreshToken

Channels

Variable	Config Field
OXICRAB_TELEGRAM_TOKEN	channels.telegram.token
OXICRAB_DISCORD_TOKEN	channels.discord.token
OXICRAB_SLACK_BOT_TOKEN	channels.slack.botToken
OXICRAB_SLACK_APP_TOKEN	channels.slack.appToken
OXICRAB_TWILIO_ACCOUNT_SID	channels.twilio.accountSid
OXICRAB_TWILIO_AUTH_TOKEN	channels.twilio.authToken

Tools

Variable	Config Field
OXICRAB_GITHUB_TOKEN	tools.github.token
OXICRAB_WEATHER_API_KEY	tools.weather.apiKey
OXICRAB_TODOIST_TOKEN	tools.todoist.token
OXICRAB_WEB_SEARCH_API_KEY	tools.webSearch.apiKey
OXICRAB_GOOGLE_CLIENT_SECRET	tools.google.clientSecret
OXICRAB_OBSIDIAN_API_KEY	tools.obsidian.apiKey
OXICRAB_MEDIA_RADARR_API_KEY	tools.media.radarr.apiKey
OXICRAB_MEDIA_SONARR_API_KEY	tools.media.sonarr.apiKey

Voice

Variable	Config Field
OXICRAB_TRANSCRIPTION_API_KEY	voice.transcription.apiKey

Agent Defaults

All agent defaults live under agents.defaults in config.toml. See Workspace Docs for how these settings map to on-disk files.

Top-level Fields

Field	Type	Default	Description
workspace	string	~/.oxicrab/workspace	Path to workspace directory (see Workspace Docs)
maxTokens	u32	8192	Max tokens per LLM response
temperature	f32?	0.7	LLM sampling temperature (0.0–2.0). Omit the field to let the provider use its default. Can be overridden per-provider via `providers.<name>.temperature`.
maxToolIterations	usize	20	Max agent loop iterations per turn
sessionTtlDays	u32	30	Days before inactive sessions are pruned
mediaTtlDays	u32	7	Days before cached media files are cleaned up
maxConcurrentSubagents	usize	5	Max simultaneous background subagents

Agent Loop Behavior

The agent loop includes several automatic behaviors that do not require configuration:

Pre-flight token estimation — before each LLM call, estimates tokens from message characters and reasoning content (chars / 4). If the estimate exceeds 80% of the compaction threshold, trims oldest messages. Prevents wasted API calls on oversized contexts.
Per-session message queuing — messages arriving during an active agent run are queued (max 10 per session) and coalesced into one turn when the current run completes. Coalescing merges content, media, metadata, and action dispatches from all queued messages. No acknowledgment is sent for queued messages.
Daily session rotation — sessions are automatically rotated at UTC day boundaries. Long-term context is preserved in the memory database via per-turn extraction.
Session reset — users can send “reset”, “clear history”, “new session”, or “start over” to immediately clear their session.

Workspace TTL

Config path: agents.defaults.workspaceTtl

Per-category time-to-live (in days) for workspace files tracked by the workspace manager. Files older than their category's TTL are removed during the hygiene cycle. Omit a category to make it non-expiring.

Field	Type	Default	Description
temp	u64?	7	Temporary files
downloads	u64?	30	Downloaded files
images	u64?	90	Image files
code	u64?	omitted	Code files (never expire by default)
documents	u64?	omitted	Document files (never expire by default)
data	u64?	omitted	Data files (never expire by default)

Compaction

Config path: agents.defaults.compaction

After compaction builds the final message list, orphaned tool messages are automatically cleaned up. Tool result messages (role="tool") whose tool_call_id has no matching assistant tool_use block are removed to prevent API errors with providers that enforce strict tool pairing (e.g. Anthropic).

Field	Type	Default	Description
enabled	bool	true	Enable automatic context compaction
thresholdTokens	u32	40000	Token count that triggers compaction
keepRecent	usize	10	Number of recent messages to preserve verbatim
extractionEnabled	bool	true	Extract facts to memory during compaction
model	string?	omitted	Override model for compaction (uses the default model when omitted)
preFlushEnabled	bool	false	Flush pending memory notes to disk before compaction runs, ensuring extracted facts survive context truncation

Memory

Config path: agents.defaults.memory

Field	Type	Default	Description
embeddingsEnabled	bool	true	Enable hybrid vector+keyword search via local ONNX embeddings
embeddingsModel	string	BAAI/bge-small-en-v1.5	Embedding model for vector search
hybridWeight	f32	0.5	Blend weight: 0.0 = keyword only, 1.0 = vector only
searchFusionStrategy	string	"weighted_score"	Fusion strategy for hybrid search: `"weighted_score"` (linear blend) or `"rrf"` (reciprocal rank fusion)
rrfK	u32	60	RRF smoothing constant (only used when fusion strategy is `"rrf"`)
embeddingCacheSize	usize	10000	LRU cache size for embedding query results
recencyHalfLifeDays	u32	90	Half-life in days for BM25 recency decay. Older entries get lower keyword search scores. 0 disables decay.

When embeddings are enabled, the system prompt context injection automatically uses hybrid search (combined keyword + vector similarity) instead of keyword-only search. Missing embeddings are back-filled automatically.

Model Routing

Config path: agents.defaults.modelRouting

Model routing controls which model handles each type of work. It has three concepts:

default — the base model used for all tasks unless overridden.
tasks — per-task model overrides. Background tasks (daemon, cron, compaction, subagent) take a simple model string. The chat task supports an object with complexity-based escalation.
fallbacks — ordered provider resilience chain. If the primary provider fails, try the next in order.

Basic Configuration

Route background tasks to a cheap model while keeping the default for conversations:

[agents.defaults.modelRouting]
default = "moonshot/kimi-k2.5"
fallbacks = ["anthropic/claude-sonnet-4-5-20250929"]

[agents.defaults.modelRouting.tasks]
daemon = "anthropic/claude-haiku-4-5-20251001"
cron = "anthropic/claude-haiku-4-5-20251001"
compaction = "anthropic/claude-haiku-4-5-20251001"
subagent = "anthropic/claude-sonnet-4-5-20250929"
web_summary = "anthropic/claude-haiku-4-5-20251001"

Field	Type	Default	Description
default	string	"claude-sonnet-4-5-20250929"	Base `provider/model` string used for all tasks unless overridden by `tasks`.
fallbacks	string[]	[]	Ordered fallback chain of `provider/model` strings. Tried in order when the primary provider fails.
tasks	object	{}	Per-task overrides. Keys: `daemon`, `cron`, `compaction`, `subagent`, `chat`, `web_summary` (used by `web_fetch_summary`). Values: model string or chat routing object (for `chat` only).

Task providers are resolved once at startup. Each unique model gets its own provider instance with connection pooling.

Chat Complexity Routing

Optional — off by default. When not configured, all messages use the default model.

The chat task supports complexity-based model escalation. Each inbound message is scored across 7 dimensions (sub-millisecond, no API calls) and routed to a more capable model when the message is complex. Simple messages use the default model; complex ones escalate.

[modelRouting]
default = "openrouter/google/gemini-3-flash"

[modelRouting.tasks]
daemon = "openrouter/google/gemini-3-flash"
cron = "openrouter/google/gemini-3-flash"

[modelRouting.tasks.chat.thresholds]
standard = 0.3
heavy = 0.65

[modelRouting.tasks.chat.models]
standard = "anthropic/claude-sonnet-4-5-20250929"
heavy = "anthropic/claude-opus-4-6"

With this config: “hi” uses Gemini Flash (below 0.3), “what is a mutex?” uses Claude Sonnet (between 0.3 and 0.65), and “analyze the architecture trade-offs step by step” uses Claude Opus (above 0.65). Cron and daemon always use Gemini Flash.

Scoring dimensions:

Dimension	Method	Default Weight
Message length	Sigmoid normalization (centered at 500 chars)	0.10
Reasoning keywords	Aho-Corasick scan for ~24 terms (“analyze”, “step by step”, “trade-off”, etc.); saturates at 3 hits	0.30
Technical vocabulary	Aho-Corasick scan for ~25 terms (“algorithm”, “API”, “middleware”, etc.); saturates at 5 hits	0.15
Question complexity	Regex: simple → 0.1, comparative → 0.5, analytical → 0.7, multi-part → 0.9	0.15
Code presence	Regex: code fences, inline code, code-like patterns	0.10
Instruction complexity	Regex: numbered lists, sequential markers, imperative verbs; saturates at 4 steps	0.15
Conversational simplicity	Greeting/filler detection. Negative weight — pushes score down.	−0.20

Score mapping:

score < thresholds.standard → use default model
standard ≤ score < thresholds.heavy → use models.standard
score ≥ thresholds.heavy → use models.heavy

Force overrides bypass scoring: 2+ reasoning keywords → heavy. Pure greeting/filler → default. Message >50KB → heavy.

Field	Type	Default	Description
thresholds.standard	float	0.3	Score at or above this → standard model
thresholds.heavy	float	0.65	Score at or above this → heavy model
models.standard	string	—	Model for medium-complexity messages (required)
models.heavy	string	—	Model for high-complexity messages (required)
weights.*	float	varies	Per-dimension scoring weights. Adjust to tune how strongly each dimension influences the final score.

Circuit Breaker

Wraps the LLM provider with a three-state circuit breaker that trips only on transient errors, preventing cascading failures during outages.

Config path: providers.circuitBreaker

[providers.circuitBreaker]
enabled = true
failureThreshold = 5
recoveryTimeoutSecs = 60
halfOpenProbes = 2

Field	Type	Default	Description
enabled	bool	false	Enable circuit breaker wrapping
failureThreshold	u32	5	Consecutive transient failures before opening
recoveryTimeoutSecs	u64	60	Seconds to wait in Open state before probing
halfOpenProbes	u32	2	Successful probes needed to close again

States

Closed — normal operation, requests pass through
Open — all requests rejected immediately (after failureThreshold consecutive transient failures)
HalfOpen — allows halfOpenProbes test requests after recoveryTimeoutSecs

Transient vs Non-Transient Errors

Transient (trip the breaker): HTTP 429, 5xx, timeout, connection refused/reset.

Non-transient (do not trip): auth errors, invalid API key, permission denied, context length exceeded.

Cognitive Routines

Escalating pressure signals that nudge the LLM to self-checkpoint its progress during long tool-heavy agent loop runs. Prevents loss of context when compaction discards older messages.

Config path: agents.defaults.cognitive

[agents.defaults.cognitive]
enabled = true
gentleThreshold = 12
firmThreshold = 20
urgentThreshold = 30
recentToolsWindow = 10

Field	Type	Default	Description
enabled	bool	false	Enable cognitive checkpoint pressure signals
gentleThreshold	u32	12	Tool calls before a gentle hint to summarize progress
firmThreshold	u32	20	Tool calls before a firm warning to write a checkpoint
urgentThreshold	u32	30	Tool calls before an urgent demand to stop and summarize
recentToolsWindow	usize	10	Rolling window size for tracking recent tool names

Pressure Levels

Gentle (hint) — suggests briefly noting progress
Firm (warning) — asks the LLM to pause and write a checkpoint
Urgent (STOP) — demands an immediate detailed progress summary

Each level fires only once per checkpoint cycle. Counters reset when a periodic checkpoint fires.

Exfiltration Guard

Hides network-outbound tools from the LLM when enabled, preventing prompt-injected data exfiltration. Tools that declare network_outbound: true in their capabilities are automatically filtered from tool definitions and blocked at dispatch time. Use allowTools to selectively re-enable specific network tools.

Config path: tools.exfiltrationGuard

[tools.exfiltrationGuard]
enabled = true
allowTools = ["web_search"]

Field	Type	Default	Description
enabled	bool	false	Enable exfiltration guard
allowTools	string[]	[]	Network-outbound tool names to allow even when guard is enabled

Prompt Guard

Regex-based prompt injection detection. Scans user messages before LLM processing and tool output after execution. Four pattern categories: role switching, instruction override, secret extraction, and jailbreak patterns.

Config path: agents.defaults.promptGuard

[agents.defaults.promptGuard]
enabled = true
action = "block"

Field	Type	Default	Description
enabled	bool	false	Enable prompt injection detection
action	string	"warn"	"warn" (log and continue) or "block" (reject the message)

Detection Categories

Role switching — attempts to change persona ("ignore previous instructions", "you are now...")
Instruction override — attempts to replace system prompts ("new instructions:", "override system...")
Secret extraction — attempts to extract system prompts ("show me your system prompt", "repeat your instructions")
Jailbreak — common jailbreak prefixes ("DAN mode", "developer mode", "jailbreak")

User messages are scanned with the configured action. Tool output is always warn-only (tool output may legitimately contain these phrases).

Operator Approval

Interactive approval workflow for mutating tool actions. When enabled, the bot pauses before executing a covered action, sends an approval request with Approve/Deny buttons, and waits for an operator response. If denied or timed out, the action is not executed and the LLM receives an error result.

Config path: agents.defaults.approval

[agents.defaults.approval]
enabled = true
channel = "slack:C0ABC123"
timeout = 300
actions = ["google_mail.send", "google_mail.reply"]

Field	Type	Default	Description
enabled	bool	false	Enable interactive operator approval for mutating actions
channel	string	""	Approval target. Format: `"channel:chatId"` (e.g., `"slack:C0ABC123"`). Empty = same conversation (self-approval)
timeout	integer	300	Seconds to wait for operator response before auto-deny. Minimum 10 seconds when approval is enabled.
actions	string[]	[]	Actions requiring approval (e.g., `["google_mail.send"]`). Empty = all mutating actions

Channel format

The channel field uses a "channel_type:chat_id" format, the same format used by cron targets:

"slack:C0ABC123" — Slack channel ID (not #name)
"discord:123456789" — Discord channel ID
"telegram:12345" — Telegram chat ID

Leave empty for self-approval (buttons appear in the user's own conversation). The bot must be a member of the configured channel.

Action matching

The actions list accepts two formats:

"tool_name.action" — matches a specific action on a specific tool (e.g. "google_mail.send")
"tool_name" — matches all mutating (non-read-only) actions on that tool (e.g. "github" covers create_issue, create_pr_review, trigger_workflow but not read-only actions like list_issues)

When actions is empty, all non-read-only actions across all tools require approval. This is the strictest setting. Single-purpose tools (like exec, write_file) automatically resolve their declared action name — use "exec.execute" or "write_file.write" in the actions list.

Behavior

Disabled (default) — no interactive approval. The existing hard-block on tools like Gmail, Calendar, and GitHub remains active as a safety net. A startup warning is logged for tools with mutating actions that have no approval gate at all.
Enabled, empty actions — interactive approval for all mutating actions. The hard-block is replaced with the interactive flow.
Enabled, explicit actions — interactive approval only for listed actions. Unlisted mutating actions fall through to the legacy hard-block.

Reflection on tool failures

Reflexion-style failure reflection. When a tool call returns an error, the agent loop optionally invokes a small LLM call that produces a hypothesis (one-sentence cause) and a retry strategy (one concrete instruction for the next attempt). The reflection is appended to the failed result content as a <reflection> block so the next iteration sees it explicitly, and persisted to the tool_reflections table for offline analysis.

Off by default. Bounded per-request and per-tool to keep cost predictable.

Config path: agents.defaults.reflection

[agents.defaults.reflection]
enabled = false
maxPerRequest = 2
maxPerTool = 1
temperature = 0.2
maxTokens = 200
persistToDb = true
allowedTools = []      # empty = all tools eligible
blockedTools = []      # subset to silence even when enabled

Field	Type	Default	Description
enabled	bool	false	Master switch. Off while gathering data; turn on once metrics show retries are succeeding.
maxPerRequest	integer	2	Maximum reflections produced within a single agent run. Hard cap on per-request cost.
maxPerTool	integer	1	Maximum reflections per `(tool, action)` pair within a single run. Prevents reflecting on the same broken call repeatedly.
temperature	float	0.2	Sampling temperature for the reflection call. Low for determinism.
maxTokens	integer	200	Maximum response tokens for the reflection call.
persistToDb	bool	true	Persist each reflection to the `tool_reflections` table. Disable in tests; leave on in production for offline analysis.
allowedTools	string[]	[]	When non-empty, reflection only fires for tools in this list. Useful for staged rollout.
blockedTools	string[]	[]	Tools that never get reflection even when enabled. Useful for chronic-failure tools.

CLI

Reflection stats are queryable via oxicrab stats reflections:

$ oxicrab stats reflections --days 7
Tool Reflections (last 7 days)
tool                   action          total     ok    err pending fail_rate
shell                  execute             3      2      1       0     33.3%
github                 list_prs            1      0      0       1       n/a

The summary highlights any (tool, action) with a failure rate ≥ 50% as a candidate for the blockedTools list.

Output format

The injected block looks like:

<reflection>
attempt: 1
hypothesis: file path was relative to the wrong directory
retry_strategy: pass the absolute path or chdir first
</reflection>

Metrics

oxicrab_reflection_triggered_total{tool, action} — reflections produced
oxicrab_reflection_llm_error_total{tool} — reflection LLM call failed

Safety

The original error string is redacted through the leak detector before being sent to the reflection model, and the model's hypothesis and retry-strategy are redacted again before being appended to the tool result and persisted. Both the prompt and the result share the existing safety perimeter.

Skills

Markdown-based knowledge files loaded into the system prompt. Each skill lives at ~/.oxicrab/workspace/skills/<name>/<name>.md with optional YAML frontmatter (name, description, hints). Skills are pre-scanned for prompt-injection and credential-exfiltration patterns before injection.

Embedding-indexed retrieval

The skills_index SQLite table stores a per-skill embedding keyed by file SHA256. SkillIndex::rebuild() re-embeds only changed files (the SHA mismatch triggers a re-index). SkillIndex::top_k_for_query() ranks indexed skills by cosine similarity against the embedded user query. Usage counters (use_count, last_used_ms) are bumped on each retrieval.

Proposing new skills

The propose_skill helper writes a candidate file to workspace/skills/staged/<name>.md. Staged skills are not loaded into the system prompt. promote_staged_skill moves a staged file into its active per-skill directory after re-running the safety scanner and verifying the staged path is a regular file (not a symlink).

Hygiene

Startup hygiene calls prune_unused_skill_index with a 30-day cutoff and min_uses = 1: any indexed skill that is ≥30 days old, has been used zero times, and has not been touched since creation is dropped from the index. Skills with any usage history are kept regardless of age. SkillIndex::rescan_active_skills re-runs the safety scanner against every active skill so files that were clean when promoted but match a newly-added pattern are surfaced via warn! + oxicrab_skill_hygiene_newly_blocked_total.

Skill names must match [A-Za-z0-9][A-Za-z0-9_-]{0,63}: alphanumeric plus _ and -, 1–64 characters, no leading _ or -, no path components.

Config path: agents.defaults.skills

[agents.defaults.skills]
indexingEnabled = true
autoRebuildOnStartup = true
maxSystemPromptSkills = 5
pruneUnusedDays = 30
embeddingModelId = ""

Field	Type	Default	Description
indexingEnabled	bool	true	Maintain the embedding-indexed `skills_index` table. When false, retrieval falls back to keyword/hint matching.
autoRebuildOnStartup	bool	true	Run `SkillIndex::rebuild` once at agent startup. Best-effort; spawned async so it doesn't block.
maxSystemPromptSkills	integer	5	Hard cap on skills retrieved per turn. Bounds system-prompt growth.
pruneUnusedDays	integer	30	Hygiene drops indexed skills older than this with `use_count = 0`.
embeddingModelId	string	""	Identifier for the embedding model. When changed, the next `rebuild` bulk-invalidates rows produced by a different model. Empty = use the value from `agents.defaults.memory.embeddingsModel`.

`skill_propose` tool

Action-based deferred tool exposing the propose/promote/reject helpers to the LLM. Discoverable via tool_search. Actions:

propose — stage a candidate skill body. Suggests Promote/Reject buttons in the response so an operator can act in one click.
list_staged — show pending proposals (read-only).
promote — activate a staged skill. Re-runs safety scanner; refuses symlinks. Mutating; covered by the operator approval workflow when configured. Triggers incremental indexing so the skill is retrievable on the next turn.
reject — discard a staged proposal. Mutating; approval-eligible.

Trajectory

Logs every dispatched tool_call, tool_result, and turn_end event into the trajectory_events SQLite table. Powers cross-session pattern detection and a daily compression pass that summarises old sessions into trajectory_summaries, dropping the raw events.

Off by default — one INSERT per tool call is small but not free. Turn it on once you want the data; the auto-suggest pipeline reads the same table to detect repeating workflows.

Configuration

Config path: agents.defaults.trajectory

[agents.defaults.trajectory]
enabled = false
compressAfterDays = 90

[agents.defaults.trajectory.autoSuggest]
enabled = false
minOccurrences = 5
minSequenceLength = 2
maxSequenceSteps = 8
useLlmBody = false

Field	Type	Default	Description
enabled	bool	false	Master switch. When off, no events are written and the auto-suggest pass is a no-op.
compressAfterDays	integer	90	Sessions where every event predates this cutoff are compressed by the daily maintenance task: a `trajectory_summaries` row replaces the raw events. Set to 0 to disable compression.
autoSuggest.enabled	bool	false	After each turn-end, scan for repeating cross-session tool sequences. The top uncovered candidate is staged at `workspace/skills/staged/auto_<name>.md` for operator review — never auto-promoted.
autoSuggest.minOccurrences	integer	5	A sequence must repeat at least this many times across distinct turns to qualify.
autoSuggest.minSequenceLength	integer	2	Minimum tool calls per turn for that turn to count toward the occurrence tally.
autoSuggest.maxSequenceSteps	integer	8	Cap on per-sequence step count. Larger sequences are truncated.
autoSuggest.useLlmBody	bool	false	When true, fire a small LLM call to write a purpose-specific skill body for staged candidates instead of the fixed template. Cost: ~1k tokens per staged candidate. Falls back to template on failure.

Coverage check

Before staging, the suggester reads the skills_index table and skips any candidate where every step's tool name appears in an existing skill's name or description. Conservative — a false-positive coverage just suppresses one candidate, never silently rewrites anything.

Skill auto-refine

After a turn that loaded a skill into context AND ran ≥ minToolCalls tool calls, fire a two-round LLM pass to decide whether the skill body could be tightened or expanded based on what just happened.

Round 1 returns a JSON assessment: {should_patch, confidence, reason}. Round 2 only fires when confidence ≥ confidenceThreshold and produces the new body. Patches are written atomically (temp + rename), audited via a {name}-CHANGELOG.md sidecar, and persisted to the skill_refinements SQLite table. The skill_refinement count powers a deterministic version (1.{N+1}.0).

Off by default — costs roughly two small LLM calls per qualifying turn, plus disk writes to your skill files.

Configuration

Config path: agents.defaults.skillRefine

[agents.defaults.skillRefine]
enabled = false
confidenceThreshold = 0.7
minToolCalls = 3
maxTokens = 800

Field	Type	Default	Description
enabled	bool	false	Master switch.
confidenceThreshold	float	0.7	Round 2 only fires when round 1's `confidence` is at or above this value.
minToolCalls	integer	3	Skip refinement unless the just-completed turn ran at least this many tool calls.
maxTokens	integer	800	Maximum response tokens for both rounds. Round 1 is internally capped at 400.

What gets patched

The candidate skill is the highest-priority hit from ContextBuilder::select_skills_for_query against the trigger user message — the same selection logic the system prompt uses, so a turn's "active skill" matches what the LLM actually saw. Recent CHANGELOG entries are injected into the round-1 prompt so the model skips re-patching gaps that earlier sessions already addressed.

Activity journal

Append-only NDJSON timeline at <workspace>/activity_journal.ndjson. Every user inbound and agent outbound is written as one JSON line with a UTC timestamp, session key, role (user/agent/system), and content. The query_activity tool is registered only when this is enabled.

The journal is never auto-rotated. Operators rotate or archive the file manually when it grows beyond their taste — deletion is safe; a fresh file is created on the next write.

Configuration

Config path: agents.defaults.activityJournal

[agents.defaults.activityJournal]
enabled = false
maxContentChars = 512
defaultWindowMinutes = 60
maxWindowMinutes = 1440

Field	Type	Default	Description
enabled	bool	false	Master switch. Appends one line per user inbound and one per agent outbound.
maxContentChars	integer	512	Truncate stored content (UTF-8 boundary respected) to this many chars. Truncated entries get a trailing `…`.
defaultWindowMinutes	integer	60	Default half-window when the agent doesn't pass `window_minutes`.
maxWindowMinutes	integer	1440	Hard cap on `window_minutes` accepted from the LLM. Bounds disk reads.

Daily maintenance

An always-on 24-hour ticker runs run_hygiene + cleanup_workspace_files regardless of these flags. When trajectory is enabled, it also performs trajectory compression on dormant sessions older than compressAfterDays. The first pass runs at startup; subsequent passes fire 24h apart.

LLM-as-Judge

Poison-resistant semantic gate for tool calls, adopted from IronClaw PR #2845. Fires after the operator approval workflow but before registry.execute: a small LLM looks at (tool_name, args, user_intent) and returns {verdict: allow|block, reason: …}. When the verdict is block, the tool call is rejected with the reason surfaced to the agent as a tool error so it can re-plan.

The judge sees only the tool name, the (credential-scrubbed) args, and the user's original message. It does NOT see the conversation history or prior tool results — including those would let an attacker poison the judge with the same injection that poisoned the agent.

Fail-open by default: timeouts, provider errors, and malformed JSON all default to allow. The judge is defense-in-depth, not the only gate — silent fail-open keeps a flaky sidecar from bricking the agent.

Configuration

Config path: agents.defaults.judge

[agents.defaults.judge]
enabled = false
maxTokens = 200
timeoutSeconds = 5
allowedTools = []
blockedTools = []

Field	Type	Default	Description
enabled	bool	false	Master switch. Costs one small LLM call per covered tool dispatch when on.
maxTokens	integer	200	Per-call response token cap. Bound the verdict length.
timeoutSeconds	integer	5	Hard ceiling on the judge LLM call. Beyond this, fail-open.
allowedTools	string[]	[]	When non-empty, only these tools get judged. Use to roll out per-tool. Mutually exclusive with `blockedTools` (allow wins).
blockedTools	string[]	[]	Tools that never get judged even when `enabled = true`. Useful for high-volume read-only tools where the cost outweighs the safety win.

Provider/model

The judge uses the main agent provider/model — there's no separate task override today. If you want a cheaper/dedicated judge model, request it. Cost is bounded by maxTokens + timeoutSeconds regardless.

Recommended rollout

Leave enabled = false until you have a baseline.
Turn on for high-risk tools first via allowedTools = ["exec", "write_file", "send_message"].
Watch oxicrab_judge_blocked_total (planned metric) and false-positive reports for a week.
Expand the allowlist as confidence grows.

The judge sits between the operator approval workflow and tool execution: exfiltration guard → MCP allowlist → operator approval → judge → param validation → registry.execute.

Memory promotion (recall-driven)

Memory entries that prove useful in retrieval should outlive the 180-day daily-note retention window. The promotion pass scans memory_search_hits over a lookback window for daily: entries that were retrieved frequently across distinct queries, then rewrites their source_key from daily:<date>... to knowledge:auto:<date>... — the knowledge: prefix is exempt from retention purge.

Adopted from openclaw's recordShortTermRecalls + short-term-promotion pattern. The data was already collected in memory_access_log; this just feeds it back. Runs during the daily maintenance ticker.

Configuration

Config path: agents.defaults.memory.promotion

[agents.defaults.memory.promotion]
enabled = false
minRecalls = 5
minUniqueQueries = 2
daysBack = 30

Field	Type	Default	Description
enabled	bool	false	Master switch. Off by default — operators should verify their `oxicrab stats search` output looks healthy first.
minRecalls	integer	5	An entry must have appeared in this many search results across the lookback window to qualify.
minUniqueQueries	integer	2	Across at least this many distinct queries — protects against one popular query dominating the signal.
daysBack	integer	30	Lookback window for the recall histogram.

LLM request timeout

Hard timeout on each LLM provider call. Without it, a hung provider holds the per-session processing lock indefinitely — the channel goes silent forever. Adopted from nanobot PR #3428.

Config path: agents.defaults.llmRequestTimeoutSeconds (top-level under [agents.defaults]). Default 300 (5 minutes). Set to 0 to disable — only do that when you're sure your provider will never hang or you're testing latency under stress.

[agents.defaults]
llmRequestTimeoutSeconds = 300

On timeout the loop synthesises an error result, releases the session lock, and the channel is unblocked. The next inbound message is processed normally.

Context Providers

External shell commands that inject dynamic content into the system prompt each turn. Each provider runs its command, caches the output with a TTL, and appends it under a # Dynamic Context header.

Config path: agents.defaults.contextProviders

[[agents.defaults.contextProviders]]
name = "Git Status"
command = "git"
args = ["status", "--short"]
enabled = true
timeout = 5
ttl = 60
requiresBins = ["git"]
requiresEnv = []

Field	Type	Default	Description
name	string	required	Section header in the system prompt
command	string	required	Executable to run
args	string[]	[]	Command arguments
enabled	bool	true	Enable or disable this provider
timeout	u64	5	Execution timeout in seconds
ttl	u64	300	Cache lifetime in seconds before re-executing
requiresBins	string[]	[]	Required binaries (skipped if any missing)
requiresEnv	string[]	[]	Required environment variables (skipped if any missing)

Providers that fail, time out, or have missing dependencies are silently skipped — they never block the agent loop.

Validation: Commands must be non-empty, contain no control characters, and not include path separators. This prevents injection through crafted command strings.

Gateway

Controls the HTTP gateway server used by oxicrab gateway. Config path: gateway

[gateway]
enabled = true
host = "127.0.0.1"
port = 18790
apiKey = "your-secret-api-key"

[gateway.webhooks.github]
enabled = true
secret = "your-hmac-secret"
template = "GitHub {{action}} on {{repository.full_name}}: {{body}}"
agentTurn = true

[[gateway.webhooks.github.targets]]
channel = "slack"
chatId = "C12345"

Field	Type	Default	Description
enabled	bool	true	Enable or disable the HTTP gateway server
host	string	127.0.0.1	Bind address for the gateway HTTP server
port	u16	18790	Port for the gateway HTTP server
apiKey	string	""	API key for authenticating `/api/chat` and A2A task endpoints. Requests must include `Authorization: Bearer <key>` or `X-API-Key: <key>`. Minimum 32 characters when host is non-loopback. When empty and host is non-loopback, a startup warning is emitted. Health, webhooks (HMAC), and A2A discovery are always public.
webhooks	object	{}	Named webhook receivers (see below)
a2a	object	{}	Agent-to-Agent protocol configuration (see below)
rateLimit	object	{}	Per-IP rate limiting configuration (see below)

HTTP API Endpoints

Endpoint	Method	Description
/api/chat	POST	Send a message and receive the agent's response. Body: `{"message": "...", "session_id": "..."}`
/api/health	GET	Health check. Returns `{"status": "ready"/"starting", "version": "..."}`
/api/status	GET	System status: models, tools, channels, tokens, cron, safety, gateway, memory. Auth-gated, rate-limited.
/status	GET	HTML status dashboard. Public, auto-refreshes every 60s. Fetches data from `/api/status`.
/api/webhook/{name}	POST	Receive a webhook from an external service (see webhook config below)
/.well-known/agent.json	GET	A2A AgentCard (when A2A enabled)
/a2a/tasks	POST	Submit an A2A task. Body: `{"message": "..."}`
/a2a/tasks/{id}	GET	Get A2A task status and result

Agent-to-Agent (A2A) Protocol

Google's A2A protocol for agent discovery and interoperability. When enabled, exposes an AgentCard at /.well-known/agent.json and a task lifecycle at /a2a/tasks.

[gateway.a2a]
enabled = true
agentName = "My Agent"
agentDescription = "A helpful AI assistant"

Field	Type	Default	Description
enabled	bool	false	Enable A2A protocol endpoints
agentName	string	""	Agent name in the AgentCard
agentDescription	string	""	Agent description in the AgentCard

Tasks are processed through the same agent loop as chat messages. The task lifecycle: submitted → working → completed (or failed). Poll GET /a2a/tasks/{id} to check status.

Rate Limiting

Config path: gateway.rateLimit

Per-IP rate limiting for all gateway endpoints. Uses a token bucket algorithm with configurable sustained rate and burst capacity.

[gateway.rateLimit]
enabled = true
requestsPerSecond = 10
burst = 20
trustProxy = true
trustedProxies = ["10.0.0.0/8", "192.168.0.0/16"]

Field	Type	Default	Description
enabled	bool	false	Enable per-IP rate limiting
requestsPerSecond	u32	10	Sustained request rate per IP
burst	u32	20	Maximum burst capacity per IP
trustProxy	bool	false	Trust `X-Forwarded-For` header for client IP extraction. Enable only when running behind a reverse proxy.
trustedProxies	string[]	[]	Exact IPs or CIDRs allowed to supply `X-Forwarded-For`. Required when `trustProxy` is enabled.

When a client exceeds the rate limit, the gateway returns HTTP 429 with a Retry-After header indicating when to retry.

Webhook Configuration

Each entry in webhooks creates a receiver at POST /api/webhook/{name}. Payloads are validated with HMAC-SHA256 signature verification (constant-time comparison).

Field	Type	Default	Description
enabled	bool	true	Enable or disable this webhook endpoint. Disabled webhooks return 404
secret	string	—	HMAC-SHA256 secret for signature validation. Minimum 32 characters.
template	string	{{body}}	Message template. Use `{{key}}` for JSON payload fields, `{{body}}` for raw body
targets	array	[]	Delivery targets: `[{"channel": "slack", "chatId": "C12345"}]`
agentTurn	bool	false	If true, routes through the agent loop before delivering to targets

Signature headers checked: X-Signature-256, X-Hub-Signature-256, X-Webhook-Signature. Supports sha256= prefix (GitHub-style). Max payload: 1 MB.

Set host to "0.0.0.0" to listen on all interfaces (required for Docker/container deployments). The Twilio channel uses this same gateway for its webhook listener.

Webhook Dispatch

Each webhook can optionally include a dispatch object for structured direct tool execution, bypassing the LLM entirely.

Field	Type	Default	Description
dispatch.tool	string	—	Tool name to execute directly
dispatch.paramsTemplate	object	—	JSON parameters template with `{{key}}` substitution from the webhook payload

When dispatch is present, the webhook payload is parsed and template variables are substituted into the params, then the tool is executed directly without LLM involvement. Mutually exclusive with agentTurn.

Observability

Controls runtime telemetry exporters. Config path: observability

[observability.metrics]
enabled = true
bind = "127.0.0.1:9901"

Field	Type	Default	Description
metrics.enabled	bool	false	Enable Prometheus metrics exporter
metrics.bind	string	127.0.0.1:9901	HTTP bind address for `/metrics` endpoint

When enabled, oxicrab installs a process-wide Prometheus recorder and serves metrics at http://{bind}/metrics. This includes router counters/histograms such as route decisions, policy drift, semantic confidence, and blocked tool attempts.

Router

Controls the message router that pre-classifies inbound messages before LLM involvement. Config path: router

[router]
prefix = "!"
semanticTopK = 3
semanticPrefilterK = 12
semanticThreshold = 0.5

[[router.rules]]
trigger = "weather"
tool = "weather"

[router.rules.params]
action = "forecast"
location = "$1"

[[router.rules]]
trigger = "remind"
tool = "cron"

[router.rules.params]
action = "add"
message = "$*"

Field	Type	Default	Description
prefix	string	"!"	Command prefix character for prefix commands. Avoids collision with Slack/Discord slash commands.
rules	array	[]	User-defined prefix commands. Each rule has `trigger` (command word), `tool` (tool name), `params` (JSON with `$1`/`$2`/`$*` substitution for positional arguments).
semanticTopK	usize	3	Maximum number of tools retained when semantic filtering is applied to unconstrained turns.
semanticPrefilterK	usize	12	Lexical prefilter candidate size before semantic reranking. Must be >= `semanticTopK`.
semanticThreshold	f32	0.5	Minimum semantic score required to keep a candidate tool. Values are clamped to [-1.0, 1.0].

The message router runs at the top of message processing and chooses deterministic dispatch first, constrained LLM second, and full LLM last. Prefix commands (e.g. !weather London) are dispatched directly to the named tool without an LLM call.

Each route emits a strict policy object used by execution: allowed_tools, blocked_tools, and reason. Tool execution enforces this policy unconditionally. For diagnostics, use !router_replay [n] (alias: !route_replay [n]) to view route decisions for recent turns in the current session.

Sandbox

Kernel-enforced filesystem and network restrictions applied to both shell commands (tools.exec.sandbox) and MCP server child processes (tools.mcp.servers.*.sandbox). On Linux, uses Landlock LSM. On macOS, uses Seatbelt (sandbox_init). Graceful no-op on unsupported platforms or older Linux kernels.

[tools.exec.sandbox]
enabled = true
additionalReadPaths = ["/opt/data"]
additionalWritePaths = ["/home/user/output"]
blockNetwork = true

Field	Type	Default	Description
enabled	bool	true	Enable kernel-level process sandboxing (Landlock on Linux, Seatbelt on macOS)
additionalReadPaths	string[]	[]	Extra paths to grant read-only access (beyond `/usr`, `/lib`, `/lib64`, `/bin`, `/sbin`, `/etc`)
additionalWritePaths	string[]	[]	Extra paths to grant read-write access (beyond workspace + `/tmp` + `/var/tmp`)
blockNetwork	bool	true	Block all outbound TCP connections from shell commands

Linux (Landlock): Default read-only: /usr, /lib, /lib64, /bin, /sbin, /etc. Default read-write: workspace dir, /tmp, /var/tmp. Degrades gracefully on older kernels via BestEffort mode.

macOS (Seatbelt): Same default paths plus macOS-specific system paths (/System, /Library, /opt/homebrew, /usr/local) for read-only, and symlink targets (/private/tmp, /private/var/folders) for read-write. Also grants process execution, Mach IPC, and signal operations required for child processes.

All other filesystem access and network connections are denied. Use oxicrab doctor to check sandbox availability on your system.

Channels

Per-channel configuration under channels.{name}. See Channel Setup for step-by-step guides. All channels share these common fields:

Field	Type	Default	Description
enabled	bool	false	Enable this channel
allowFrom	string[]	[]	Authorized sender IDs. Empty = deny-all. Use `["*"]` for open access.
allowGroups	string[]	[]	Restrict which groups/channels the bot responds in. Empty = deny-all. Use `["*"]` for open access. Non-empty = only listed group IDs.
dmPolicy	string	"allowlist"	DM access policy: `"allowlist"`, `"pairing"`, or `"open"`

dmPolicy

Controls what happens when an unrecognized sender messages the bot on a channel.

Value	Behavior
"allowlist"	Check `allowFrom` + pairing store. Silently drop unrecognized senders. This is the default and preserves the original behavior.
"pairing"	Check `allowFrom` + pairing store. If unknown, generate an 8-character pairing code and send it to the sender. The bot owner can then approve with `oxicrab pairing approve {code}`.
"open"	Allow all senders unconditionally. No access checks are performed.

See Channel Setup → Common patterns for a detailed walkthrough and access check flowchart.

Channel-specific fields

Each channel has additional required fields. See Channel Setup for the complete config blocks. Quick reference:

Channel	Required Fields
telegram	`token`. Optional: `mentionOnly` (boolean, default false) — only respond in groups when bot is @mentioned or replied to
discord	`token`. Optional: `mentionOnly` (boolean, default false) — only respond in guilds when bot is @mentioned
slack	`botToken`, `appToken`. Optional: `thinkingEmoji` (default "eyes"), `doneEmoji` (default "white_check_mark")
whatsapp	(none — scan QR on first run). In groups the bot only responds when it is mentioned (`mentioned_jid`) or quote-replied to; this is not configurable. Use a 1:1 chat (DM) for unconditional bot participation. Note: the bot uses your own WhatsApp identity, so mentions/replies aimed at you-the-human in a group will also wake it — see channel setup for the recommended fix.
twilio	`accountSid`, `authToken`, `phoneNumber`, `webhookPort`, `webhookPath`, `webhookUrl` (required). Optional: `webhookHost` (string, default "127.0.0.1") — interface to bind the webhook server; `allowGroups` (array, default []) — restrict to specific Conversation SIDs

Logging

Logging is controlled by the RUST_LOG environment variable. Oxicrab uses the tracing-subscriber format.

# Default: info level, noisy dependencies suppressed
./target/release/oxicrab gateway

# Debug logging
RUST_LOG=debug ./target/release/oxicrab gateway

# Custom filtering
RUST_LOG=info,whatsapp_rust=warn,oxicrab::channels=debug ./target/release/oxicrab gateway

Common filters:

RUST_LOG=debug — verbose, includes all oxicrab internals
RUST_LOG=info,whatsapp_rust=warn — suppress noisy WhatsApp crate logs
RUST_LOG=oxicrab::agent=debug — debug only the agent loop

Config Validation

oxicrab validates configuration at startup and rejects invalid settings with actionable error messages. These checks run after all config layers are merged.

Rule	Details
Webhook secrets	Must be at least 32 characters
Gateway API key	Must be at least 32 characters when host is non-loopback
Approval timeout	Must be at least 10 seconds when approval is enabled
Context provider commands	Must be non-empty, contain no control characters, and not include path separators
Provider custom headers	Reserved names are blocked: `Authorization`, `Content-Type`, `x-api-key`, and other internal headers
Twilio webhookUrl	Required for the Twilio channel to start
Shell allowedCommands	A startup warning is emitted when `allowedCommands` is empty (unrestricted shell access)
Tool name shadowing	Built-in tools cannot be overridden by MCP or runtime-registered tools

Resource Limits

Hard caps applied at the boundaries where attacker- or LLM-controlled input enters the process. None are tunable today — they exist to prevent OOM and runaway-payload conditions, not to be policy knobs.

Boundary	Cap
Inbound message (`MessageBus::publish_inbound`)	1 MB — longer messages are truncated
Gateway request body (chat, webhook, A2A)	1 MB via `DefaultBodyLimit`
Webhook payload	1 MB
HTTP response bodies (HTTP / web tools)	10 MB via `limited_body()`
HTML extracted by browser tool	500 KB
Browser screenshot height	10080 px clamp
Audio uploads (cloud transcription)	25 MB
Image generation base64 payload	30 MB pre-decode check
Context files (`USER.md`, `TOOLS.md`, `AGENTS.md`)	500 KB each
Skill file (`{name}.md`)	1 MB each
Context provider output	100 KB per call
Compaction summary	2000 chars (prevents unbounded growth across cycles)

Configuration

Layout & Overrides

File Layout

Recommended Pattern

Example

Model Routing

Routing Resolution Table

API Key Models

Supported Providers

Prompt-Guided Tool Calling

Custom Headers

Per-Provider Temperature

OAuth Models

Credentials

Resolution Order

OS Keyring

Credential Helper

1Password (desktop)

1Password (CI / containers)

Bitwarden

Custom script

Environment Variables

Providers

OAuth

Channels

Tools

Voice

Agent Defaults

Top-level Fields

Agent Loop Behavior

Workspace TTL

Compaction

Memory

Model Routing

Basic Configuration

Chat Complexity Routing

Circuit Breaker

States

Transient vs Non-Transient Errors

Cognitive Routines

Pressure Levels

Exfiltration Guard

Prompt Guard

Detection Categories

Operator Approval

Channel format

Action matching

Behavior

Reflection on tool failures

CLI

Output format

Metrics

Safety

Skills

Embedding-indexed retrieval

Proposing new skills

Hygiene

Configuration

skill_propose tool

Trajectory

Configuration

Coverage check

Skill auto-refine

Configuration

What gets patched

Activity journal

Configuration

Daily maintenance

LLM-as-Judge

Configuration

Provider/model

Recommended rollout

Memory promotion (recall-driven)

Configuration

LLM request timeout

Context Providers

Gateway

HTTP API Endpoints

Agent-to-Agent (A2A) Protocol

Rate Limiting

`skill_propose` tool