Layout & Overrides

oxicrab deserializes exactly one canonical Config schema. TOML layers are merged first, then validated once, so defaults and validation rules stay centralized in Rust rather than drifting across multiple loaders.

File Layout

config.toml → config.local.toml → config.d/*.toml → Env / Helper / Keyring

Merge precedence is left to right. Later TOML layers override earlier ones. After the merged file stack is deserialized, credential sources still resolve in the usual order: environment variables, then credential helper, then OS keyring, then TOML config.

Recommended Pattern

Example

# ~/.oxicrab/config.toml
[agents.defaults.modelRouting]
default = "anthropic/claude-sonnet-4-5-20250929"

[gateway]
host = "127.0.0.1"

# ~/.oxicrab/config.local.toml
[providers.anthropic]
apiKey = "sk-ant-..."

# ~/.oxicrab/config.d/20-office.toml
[gateway]
host = "10.0.0.25"

The effective config uses the model from config.toml, the API key from config.local.toml, and the gateway host from config.d/20-office.toml.
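The layering above can be illustrated with a small deep-merge sketch. It is not oxicrab's loader (which is written in Rust); `merge_layers` and the dict literals are hypothetical stand-ins for the already-parsed TOML files, and the sketch assumes tables merge key by key while scalars are replaced by the later layer.

```python
from copy import deepcopy


def merge_layers(*layers: dict) -> dict:
    """Deep-merge dict layers left to right; later layers win on conflicts."""
    merged: dict = {}
    for layer in layers:
        _merge_into(merged, layer)
    return merged


def _merge_into(base: dict, overlay: dict) -> None:
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _merge_into(base[key], value)  # recurse into tables present in both
        else:
            base[key] = deepcopy(value)  # scalars and new tables replace


# The three files from the example, written as already-parsed tables.
config_toml = {
    "agents": {"defaults": {"modelRouting": {
        "default": "anthropic/claude-sonnet-4-5-20250929"}}},
    "gateway": {"host": "127.0.0.1"},
}
config_local_toml = {"providers": {"anthropic": {"apiKey": "sk-ant-..."}}}
office_toml = {"gateway": {"host": "10.0.0.25"}}

effective = merge_layers(config_toml, config_local_toml, office_toml)
```

Here `effective["gateway"]["host"]` comes from the last layer, while the model and API key survive from the earlier layers because nothing later overrides them.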

Model Routing

oxicrab uses a 2-tier resolution strategy to determine which LLM provider handles a model:

  1. Prefix notation: groq/llama-3.1-70b (provider name before the /, stripped before the API call)
  2. Auto-detection: known model-name prefixes (claude-* → Anthropic, gpt-*/o1/o3/o4 → OpenAI, gemini-* → Gemini, deepseek-* → DeepSeek)

Routing Resolution Table

| Model value | Resolved provider | API model sent |
| --- | --- | --- |
| claude-sonnet-4-5-20250929 | anthropic (auto) | claude-sonnet-4-5-20250929 |
| anthropic/claude-opus-4-6 | anthropic (prefix) | claude-opus-4-6 |
| groq/llama-3.1-70b | groq (prefix) | llama-3.1-70b |
| ollama/qwen3-coder:30b | ollama (prefix) | qwen3-coder:30b |
| ollama/meta-llama/Llama-3.3-70B | ollama (prefix) | meta-llama/Llama-3.3-70B |
| deepseek-chat | deepseek (auto) | deepseek-chat |
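The two tiers can be sketched as follows. This is an illustration of the documented behavior, not oxicrab's implementation: `resolve_model`, `KNOWN_PROVIDERS`, and `AUTO_DETECT` are hypothetical names, the lowercase provider identifiers are assumed from the config keys, and raising an error for an unrecognized model is an assumption.

```python
KNOWN_PROVIDERS = {
    "anthropic", "openai", "gemini", "openrouter", "deepseek", "groq",
    "moonshot", "zhipu", "dashscope", "minimax", "vllm", "ollama",
}

# (model-name prefix, provider) pairs for tier 2, from the list above.
AUTO_DETECT = [
    ("claude-", "anthropic"),
    ("gpt-", "openai"),
    ("o1", "openai"),
    ("o3", "openai"),
    ("o4", "openai"),
    ("gemini-", "gemini"),
    ("deepseek-", "deepseek"),
]


def resolve_model(value: str) -> tuple[str, str, str]:
    """Return (provider, how, api_model) for a configured model string."""
    # Tier 1: split on the FIRST slash only, so model names that contain
    # their own slashes (e.g. meta-llama/...) stay intact.
    prefix, sep, rest = value.partition("/")
    if sep and prefix in KNOWN_PROVIDERS:
        return prefix, "prefix", rest
    # Tier 2: fall back to known model-name prefixes.
    for name_prefix, provider in AUTO_DETECT:
        if value.startswith(name_prefix):
            return provider, "auto", value
    raise ValueError(f"cannot resolve a provider for {value!r}")
```

Splitting on the first slash only is what lets `ollama/meta-llama/Llama-3.3-70B` keep `meta-llama/Llama-3.3-70B` as the API model name.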

API Key Models

Set the model and provide the corresponding API key. The provider is auto-detected from the model name:

[agents.defaults.modelRouting]
default = "claude-sonnet-4-5-20250929"

[providers.anthropic]
apiKey = "sk-ant-api03-..."

Available API key models:

Prompt caching: Anthropic providers automatically use cache_control on system messages and tool definitions, enabling up to 90% input token cost reduction for repeated content (5-minute TTL).

Supported Providers

The following providers are supported. Use prefix notation (e.g. groq/llama-3.1-70b) to route to them:

[agents.defaults.modelRouting]
default = "deepseek/deepseek-chat"

[providers.deepseek]
apiKey = "sk-..."

| Provider | Default Base URL | API Key Required |
| --- | --- | --- |
| Anthropic | https://api.anthropic.com/v1/messages | Yes (or OAuth) |
| OpenAI | https://api.openai.com/v1/chat/completions | Yes |
| Gemini | https://generativelanguage.googleapis.com/v1 | Yes |
| OpenRouter | https://openrouter.ai/api/v1/chat/completions | Yes |
| DeepSeek | https://api.deepseek.com/v1/chat/completions | Yes |
| Groq | https://api.groq.com/openai/v1/chat/completions | Yes |
| Moonshot | https://api.moonshot.cn/v1/chat/completions | Yes |
| Zhipu | https://open.bigmodel.cn/api/paas/v4/chat/completions | Yes |
| DashScope | https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions | Yes |
| MiniMax | https://api.minimaxi.chat/v1/text/chatcompletion_v2 | Yes |
| vLLM | http://localhost:8000/v1/chat/completions | No |
| Ollama | http://localhost:11434/v1/chat/completions | No |

Local providers (Ollama and vLLM) do not require an API key. Use the provider/model prefix format to route to them — the prefix is stripped before sending to the API (e.g. ollama/qwen3-coder:30b sends qwen3-coder:30b to the Ollama API).

Prompt-Guided Tool Calling

Local models often ignore native JSON tool schemas, responding with plain text instead of structured tool calls. Enable promptGuidedTools on the provider to work around this:

[providers.ollama]
promptGuidedTools = true

When enabled, the PromptGuidedToolsProvider wrapper:

This works with both direct local models and modelRouting.fallbacks for fallback chains. Currently supported for Ollama and vLLM providers.

To override the default endpoint, set apiBase on the provider:

[providers.vllm]
apiKey = "token-abc123"
apiBase = "http://my-server:8080/v1/chat/completions"

Custom Headers

Inject custom HTTP headers into every request for an OpenAI-compatible provider. Useful for authentication proxies, routing, or API gateways:

[providers.openrouter]
apiKey = "sk-..."

[providers.openrouter.headers]
X-Custom-Header = "value"
HTTP-Referer = "https://myapp.example.com"

Headers are merged into every chat and warmup request alongside the standard Authorization and Content-Type headers. Reserved header names (Authorization, Content-Type, x-api-key, and other internal headers) are blocked and will cause a config validation error.

Per-Provider Temperature

Override the global agents.defaults.temperature for a specific provider. Useful for providers like Moonshot/Kimi that enforce their own temperature constraints:

[providers.moonshot]
apiKey = "sk-..."
temperature = 0.6

Resolution order: per-provider → global → omit (API default). In TOML, leave the field unset to omit it from the API payload and let the provider use its own default.

OAuth Models

Anthropic OAuth is attempted when the provider resolves to anthropic — via anthropic/ prefix or auto-detection of claude-* models:

For OAuth models, either install Claude CLI (auto-detected) or configure credentials manually:

[providers.anthropicOAuth]
enabled = true
autoDetect = true
credentialsPath = "~/.anthropic/credentials.json"

Credentials

Oxicrab supports multiple credential backends. Each layer only fills fields that are still empty, so higher-priority sources always win.

Resolution Order

Environment variables → Credential helper → OS keyring → config.toml

Use oxicrab credentials list to see where each credential comes from.
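The "each layer only fills fields that are still empty" rule can be sketched like this. `resolve_credentials` and the sample values are hypothetical; the slot names reuse config field paths from this page purely for illustration.

```python
def resolve_credentials(slots, sources):
    """Fill each slot from the first source (in priority order) that has it.

    `sources` is a list of (name, mapping) pairs, highest priority first.
    Returns {slot: (value, source_name)}; unset slots are left out.
    """
    resolved = {}
    for name, values in sources:
        for slot in slots:
            value = values.get(slot)
            if slot not in resolved and value:  # only fill still-empty slots
                resolved[slot] = (value, name)
    return resolved


sources = [
    ("env", {"providers.anthropic.apiKey": "sk-from-env"}),
    ("helper", {"providers.anthropic.apiKey": "sk-from-helper",
                "tools.github.token": "gh-from-helper"}),
    ("keyring", {}),
    ("config", {"tools.github.token": "gh-from-config",
                "channels.telegram.token": "tg-from-config"}),
]
slots = ["providers.anthropic.apiKey", "tools.github.token",
         "channels.telegram.token"]
resolved = resolve_credentials(slots, sources)
```

Because lower-priority sources never overwrite a filled slot, the Anthropic key comes from the environment even though the helper also supplies one.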

OS Keyring

Credentials can be stored in the OS keychain (macOS Keychain, GNOME Keyring, Windows Credential Manager) via the keyring-store feature (default-on). Containers should use OXICRAB_* env vars instead. See CLI Reference → credentials for keyring management commands.

Credential Helper

External programs (1Password, Bitwarden, custom scripts) can supply credentials at startup. Configure under the top-level credentialHelper key.

1Password (desktop)

[credentialHelper]
command = "op"
args = ["item", "get", "oxicrab", "--format", "json"]
format = "1password"

1Password (CI / containers)

[credentialHelper]
command = "op"
args = ["read", "-"]
format = "json"

Bitwarden

[credentialHelper]
command = "bw"
args = ["get", "item", "oxicrab"]
format = "bitwarden"

Custom script

[credentialHelper]
command = "/path/to/my-script.sh"
args = []
format = "json"

| Format | Description |
| --- | --- |
| 1password | Parses 1Password op item get JSON output (fields array) |
| bitwarden | Parses Bitwarden bw get item JSON output (fields array) |
| json | Expects a flat JSON object: {"anthropic-api-key": "sk-...", ...} |
| line | Expects key=value pairs, one per line |

Environment Variables

All 29 credential slots can be set via environment variables. Recommended for containers and CI.

Providers

| Variable | Config Field |
| --- | --- |
| OXICRAB_ANTHROPIC_API_KEY | providers.anthropic.apiKey |
| OXICRAB_OPENAI_API_KEY | providers.openai.apiKey |
| OXICRAB_OPENROUTER_API_KEY | providers.openrouter.apiKey |
| OXICRAB_GEMINI_API_KEY | providers.gemini.apiKey |
| OXICRAB_DEEPSEEK_API_KEY | providers.deepseek.apiKey |
| OXICRAB_GROQ_API_KEY | providers.groq.apiKey |
| OXICRAB_MOONSHOT_API_KEY | providers.moonshot.apiKey |
| OXICRAB_ZHIPU_API_KEY | providers.zhipu.apiKey |
| OXICRAB_DASHSCOPE_API_KEY | providers.dashscope.apiKey |
| OXICRAB_MINIMAX_API_KEY | providers.minimax.apiKey |
| OXICRAB_VLLM_API_KEY | providers.vllm.apiKey |
| OXICRAB_OLLAMA_API_KEY | providers.ollama.apiKey |

OAuth

| Variable | Config Field |
| --- | --- |
| OXICRAB_ANTHROPIC_OAUTH_ACCESS | providers.anthropicOAuth.accessToken |
| OXICRAB_ANTHROPIC_OAUTH_REFRESH | providers.anthropicOAuth.refreshToken |

Channels

| Variable | Config Field |
| --- | --- |
| OXICRAB_TELEGRAM_TOKEN | channels.telegram.token |
| OXICRAB_DISCORD_TOKEN | channels.discord.token |
| OXICRAB_SLACK_BOT_TOKEN | channels.slack.botToken |
| OXICRAB_SLACK_APP_TOKEN | channels.slack.appToken |
| OXICRAB_TWILIO_ACCOUNT_SID | channels.twilio.accountSid |
| OXICRAB_TWILIO_AUTH_TOKEN | channels.twilio.authToken |

Tools

| Variable | Config Field |
| --- | --- |
| OXICRAB_GITHUB_TOKEN | tools.github.token |
| OXICRAB_WEATHER_API_KEY | tools.weather.apiKey |
| OXICRAB_TODOIST_TOKEN | tools.todoist.token |
| OXICRAB_WEB_SEARCH_API_KEY | tools.webSearch.apiKey |
| OXICRAB_GOOGLE_CLIENT_SECRET | tools.google.clientSecret |
| OXICRAB_OBSIDIAN_API_KEY | tools.obsidian.apiKey |
| OXICRAB_MEDIA_RADARR_API_KEY | tools.media.radarr.apiKey |
| OXICRAB_MEDIA_SONARR_API_KEY | tools.media.sonarr.apiKey |

Voice

| Variable | Config Field |
| --- | --- |
| OXICRAB_TRANSCRIPTION_API_KEY | voice.transcription.apiKey |

Agent Defaults

All agent defaults live under agents.defaults in config.toml. See Workspace Docs for how these settings map to on-disk files.

Top-level Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| workspace | string | ~/.oxicrab/workspace | Path to workspace directory (see Workspace Docs) |
| maxTokens | u32 | 8192 | Max tokens per LLM response |
| temperature | f32? | 0.7 | LLM sampling temperature (0.0–2.0). Omit the field to let the provider use its default. Can be overridden per-provider via providers.<name>.temperature. |
| maxToolIterations | usize | 20 | Max agent loop iterations per turn |
| sessionTtlDays | u32 | 30 | Days before inactive sessions are pruned |
| mediaTtlDays | u32 | 7 | Days before cached media files are cleaned up |
| maxConcurrentSubagents | usize | 5 | Max simultaneous background subagents |

Agent Loop Behavior

The agent loop includes several automatic behaviors that do not require configuration:

Workspace TTL

Config path: agents.defaults.workspaceTtl

Per-category time-to-live (in days) for workspace files tracked by the workspace manager. Files older than their category's TTL are removed during the hygiene cycle. Omit a category to make it non-expiring.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| temp | u64? | 7 | Temporary files |
| downloads | u64? | 30 | Downloaded files |
| images | u64? | 90 | Image files |
| code | u64? | omitted | Code files (never expire by default) |
| documents | u64? | omitted | Document files (never expire by default) |
| data | u64? | omitted | Data files (never expire by default) |

Compaction

Config path: agents.defaults.compaction

After compaction builds the final message list, orphaned tool messages are automatically cleaned up. Tool result messages (role="tool") whose tool_call_id has no matching assistant tool_use block are removed to prevent API errors with providers that enforce strict tool pairing (e.g. Anthropic).

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable automatic context compaction |
| thresholdTokens | u32 | 40000 | Token count that triggers compaction |
| keepRecent | usize | 10 | Number of recent messages to preserve verbatim |
| extractionEnabled | bool | true | Extract facts to memory during compaction |
| model | string? | omitted | Override model for compaction (uses the default model when omitted) |
| preFlushEnabled | bool | false | Flush pending memory notes to disk before compaction runs, ensuring extracted facts survive context truncation |

Memory

Config path: agents.defaults.memory

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| embeddingsEnabled | bool | true | Enable hybrid vector+keyword search via local ONNX embeddings |
| embeddingsModel | string | BAAI/bge-small-en-v1.5 | Embedding model for vector search |
| hybridWeight | f32 | 0.5 | Blend weight: 0.0 = keyword only, 1.0 = vector only |
| searchFusionStrategy | string | "weighted_score" | Fusion strategy for hybrid search: "weighted_score" (linear blend) or "rrf" (reciprocal rank fusion) |
| rrfK | u32 | 60 | RRF smoothing constant (only used when fusion strategy is "rrf") |
| embeddingCacheSize | usize | 10000 | LRU cache size for embedding query results |
| recencyHalfLifeDays | u32 | 90 | Half-life in days for BM25 recency decay. Older entries get lower keyword search scores. 0 disables decay. |

When embeddings are enabled, the system prompt context injection automatically uses hybrid search (combined keyword + vector similarity) instead of keyword-only search. Missing embeddings are back-filled automatically.
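To make the fusion and decay settings concrete, here is a minimal sketch of the three formulas they suggest. These are the textbook forms (linear blend, reciprocal rank fusion with 1-based ranks, exponential half-life decay); the function names are hypothetical and the exact expressions oxicrab uses internally are an assumption.

```python
def weighted_score(keyword: float, vector: float,
                   hybrid_weight: float = 0.5) -> float:
    """Linear blend: 0.0 = keyword only, 1.0 = vector only."""
    return (1.0 - hybrid_weight) * keyword + hybrid_weight * vector


def rrf_score(ranks: list[int], k: int = 60) -> float:
    """Reciprocal rank fusion: sum of 1 / (k + rank) over each ranking.

    `ranks` holds the 1-based position of one document in each result
    list (for example, its keyword rank and its vector rank).
    """
    return sum(1.0 / (k + rank) for rank in ranks)


def recency_factor(age_days: float, half_life_days: int = 90) -> float:
    """Multiplier applied to a keyword score; halves every half-life."""
    if half_life_days == 0:  # 0 disables decay
        return 1.0
    return 0.5 ** (age_days / half_life_days)
```

With the defaults, an entry that is 90 days old keeps half of its keyword score, and a larger `rrfK` flattens the difference between adjacent ranks.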

Model Routing

Config path: agents.defaults.modelRouting

Model routing controls which model handles each type of work. It has three concepts:

Basic Configuration

Route background tasks to a cheap model while keeping the default for conversations:

[agents.defaults.modelRouting]
default = "moonshot/kimi-k2.5"
fallbacks = ["anthropic/claude-sonnet-4-5-20250929"]

[agents.defaults.modelRouting.tasks]
daemon = "anthropic/claude-haiku-4-5-20251001"
cron = "anthropic/claude-haiku-4-5-20251001"
compaction = "anthropic/claude-haiku-4-5-20251001"
subagent = "anthropic/claude-sonnet-4-5-20250929"
web_summary = "anthropic/claude-haiku-4-5-20251001"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| default | string | "claude-sonnet-4-5-20250929" | Base provider/model string used for all tasks unless overridden by tasks. |
| fallbacks | string[] | [] | Ordered fallback chain of provider/model strings. Tried in order when the primary provider fails. |
| tasks | object | {} | Per-task overrides. Keys: daemon, cron, compaction, subagent, chat, web_summary (used by web_fetch_summary). Values: model string or chat routing object (for chat only). |

Task providers are resolved once at startup. Each unique model gets its own provider instance with connection pooling.

Chat Complexity Routing

Optional — off by default. When not configured, all messages use the default model.

The chat task supports complexity-based model escalation. Each inbound message is scored across 7 dimensions (sub-millisecond, no API calls) and routed to a more capable model when the message is complex. Simple messages use the default model; complex ones escalate.

[agents.defaults.modelRouting]
default = "openrouter/google/gemini-3-flash"

[agents.defaults.modelRouting.tasks]
daemon = "openrouter/google/gemini-3-flash"
cron = "openrouter/google/gemini-3-flash"

[agents.defaults.modelRouting.tasks.chat.thresholds]
standard = 0.3
heavy = 0.65

[agents.defaults.modelRouting.tasks.chat.models]
standard = "anthropic/claude-sonnet-4-5-20250929"
heavy = "anthropic/claude-opus-4-6"

With this config: “hi” uses Gemini Flash (below 0.3), “what is a mutex?” uses Claude Sonnet (between 0.3 and 0.65), and “analyze the architecture trade-offs step by step” uses Claude Opus (above 0.65). Cron and daemon always use Gemini Flash.

Scoring dimensions:

| Dimension | Method | Default Weight |
| --- | --- | --- |
| Message length | Sigmoid normalization (centered at 500 chars) | 0.10 |
| Reasoning keywords | Aho-Corasick scan for ~24 terms (“analyze”, “step by step”, “trade-off”, etc.); saturates at 3 hits | 0.30 |
| Technical vocabulary | Aho-Corasick scan for ~25 terms (“algorithm”, “API”, “middleware”, etc.); saturates at 5 hits | 0.15 |
| Question complexity | Regex: simple → 0.1, comparative → 0.5, analytical → 0.7, multi-part → 0.9 | 0.15 |
| Code presence | Regex: code fences, inline code, code-like patterns | 0.10 |
| Instruction complexity | Regex: numbered lists, sequential markers, imperative verbs; saturates at 4 steps | 0.15 |
| Conversational simplicity | Greeting/filler detection. Negative weight: pushes the score down. | −0.20 |

Score mapping:

Force overrides bypass scoring: 2+ reasoning keywords → heavy. Pure greeting/filler → default. Message >50KB → heavy.
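The threshold mapping and force overrides can be sketched as a single function. `pick_chat_tier` is a hypothetical name, the order in which the overrides are checked is an assumption, and "50KB" is read here as 50 × 1024 bytes.

```python
def pick_chat_tier(score: float, reasoning_hits: int = 0,
                   pure_greeting: bool = False, message_bytes: int = 0,
                   standard: float = 0.3, heavy: float = 0.65) -> str:
    """Return "default", "standard", or "heavy" for one inbound message."""
    # Force overrides bypass the score entirely.
    if message_bytes > 50 * 1024 or reasoning_hits >= 2:
        return "heavy"
    if pure_greeting:
        return "default"
    # Otherwise compare the score against the thresholds
    # ("at or above" escalates).
    if score >= heavy:
        return "heavy"
    if score >= standard:
        return "standard"
    return "default"
```

For example, a score of 0.3 lands exactly on the standard threshold and escalates to the standard model, while 0.29 stays on the default.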

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| thresholds.standard | float | 0.3 | Score at or above this → standard model |
| thresholds.heavy | float | 0.65 | Score at or above this → heavy model |
| models.standard | string | | Model for medium-complexity messages (required) |
| models.heavy | string | | Model for high-complexity messages (required) |
| weights.* | float | varies | Per-dimension scoring weights. Adjust to tune how strongly each dimension influences the final score. |

Circuit Breaker

Wraps the LLM provider with a three-state circuit breaker that trips only on transient errors, preventing cascading failures during outages.

Config path: providers.circuitBreaker

[providers.circuitBreaker]
enabled = true
failureThreshold = 5
recoveryTimeoutSecs = 60
halfOpenProbes = 2

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable circuit breaker wrapping |
| failureThreshold | u32 | 5 | Consecutive transient failures before opening |
| recoveryTimeoutSecs | u64 | 60 | Seconds to wait in Open state before probing |
| halfOpenProbes | u32 | 2 | Successful probes needed to close again |

States

Transient vs Non-Transient Errors

Transient (trip the breaker): HTTP 429, 5xx, timeout, connection refused/reset.

Non-transient (do not trip): auth errors, invalid API key, permission denied, context length exceeded.
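A three-state breaker driven by these settings might look like the sketch below. It follows the conventional closed / open / half-open pattern; the class and method names are hypothetical, and two details are assumptions: a transient failure during half-open probing re-opens the breaker, and non-transient errors leave the failure counter untouched.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout_secs=60,
                 half_open_probes=2, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout_secs = recovery_timeout_secs
        self.half_open_probes = half_open_probes
        self.clock = clock  # injectable so tests can control time
        self.state = "closed"
        self.failures = 0
        self.probe_successes = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == "open":
            if self.clock() - self.opened_at < self.recovery_timeout_secs:
                return False  # still cooling down
            self.state = "half_open"  # timeout elapsed: start probing
            self.probe_successes = 0
        return True

    def record_success(self) -> None:
        if self.state == "half_open":
            self.probe_successes += 1
            if self.probe_successes >= self.half_open_probes:
                self.state = "closed"
                self.failures = 0
        else:
            self.failures = 0  # a success breaks the consecutive run

    def record_failure(self, transient: bool) -> None:
        if not transient:
            return  # auth errors etc. never trip the breaker
        if self.state == "half_open":
            self._open()  # assumption: a failed probe re-opens
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._open()

    def _open(self) -> None:
        self.state = "open"
        self.opened_at = self.clock()
        self.failures = 0
```

A caller would check `allow_request()` before each provider call and report the outcome with `record_success()` or `record_failure(transient=...)`.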

Cognitive Routines

Escalating pressure signals that nudge the LLM to self-checkpoint its progress during long tool-heavy agent loop runs. Prevents loss of context when compaction discards older messages.

Config path: agents.defaults.cognitive

[agents.defaults.cognitive]
enabled = true
gentleThreshold = 12
firmThreshold = 20
urgentThreshold = 30
recentToolsWindow = 10

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable cognitive checkpoint pressure signals |
| gentleThreshold | u32 | 12 | Tool calls before a gentle hint to summarize progress |
| firmThreshold | u32 | 20 | Tool calls before a firm warning to write a checkpoint |
| urgentThreshold | u32 | 30 | Tool calls before an urgent demand to stop and summarize |
| recentToolsWindow | usize | 10 | Rolling window size for tracking recent tool names |

Pressure Levels

Each level fires only once per checkpoint cycle. Counters reset when a periodic checkpoint fires.
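The once-per-cycle behavior can be sketched with a small tracker. `PressureTracker` and its method names are hypothetical; the level names mirror the gentle/firm/urgent threshold fields, and treating "counters reset" as resetting both the tool-call count and the fired levels is an assumption.

```python
class PressureTracker:
    def __init__(self, gentle=12, firm=20, urgent=30):
        self.levels = [("gentle", gentle), ("firm", firm), ("urgent", urgent)]
        self.tool_calls = 0
        self.fired = set()

    def on_tool_call(self):
        """Count one tool call; return a level name if one fires now."""
        self.tool_calls += 1
        signal = None
        for name, threshold in self.levels:
            if self.tool_calls >= threshold and name not in self.fired:
                self.fired.add(name)  # each level fires once per cycle
                signal = name  # the highest newly crossed level wins
        return signal

    def on_checkpoint(self):
        """A periodic checkpoint starts a fresh cycle."""
        self.tool_calls = 0
        self.fired.clear()
```

Between thresholds the tracker returns `None`, so the agent loop only injects a pressure message on the call that crosses a level.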

Exfiltration Guard

Hides network-outbound tools from the LLM when enabled, preventing prompt-injected data exfiltration. Tools that declare network_outbound: true in their capabilities are automatically filtered from tool definitions and blocked at dispatch time. Use allowTools to selectively re-enable specific network tools.

Config path: tools.exfiltrationGuard

[tools.exfiltrationGuard]
enabled = true
allowTools = ["web_search"]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable exfiltration guard |
| allowTools | string[] | [] | Network-outbound tool names to allow even when guard is enabled |

Prompt Guard

Regex-based prompt injection detection. Scans user messages before LLM processing and tool output after execution. Four pattern categories: role switching, instruction override, secret extraction, and jailbreak patterns.

Config path: agents.defaults.promptGuard

[agents.defaults.promptGuard]
enabled = true
action = "block"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable prompt injection detection |
| action | string | "warn" | "warn" (log and continue) or "block" (reject the message) |

Detection Categories

User messages are scanned with the configured action. Tool output is always warn-only (tool output may legitimately contain these phrases).

Operator Approval

Interactive approval workflow for mutating tool actions. When enabled, the bot pauses before executing a covered action, sends an approval request with Approve/Deny buttons, and waits for an operator response. If denied or timed out, the action is not executed and the LLM receives an error result.

Config path: agents.defaults.approval

[agents.defaults.approval]
enabled = true
channel = "slack:C0ABC123"
timeout = 300
actions = ["google_mail.send", "google_mail.reply"]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable interactive operator approval for mutating actions |
| channel | string | "" | Approval target. Format: "channel:chatId" (e.g., "slack:C0ABC123"). Empty = same conversation (self-approval) |
| timeout | integer | 300 | Seconds to wait for operator response before auto-deny. Minimum 10 seconds when approval is enabled. |
| actions | string[] | [] | Actions requiring approval (e.g., ["google_mail.send"]). Empty = all mutating actions |

Channel format

The channel field uses a "channel_type:chat_id" format, the same format used by cron targets:

Leave empty for self-approval (buttons appear in the user's own conversation). The bot must be a member of the configured channel.

Action matching

The actions list accepts two formats:

When actions is empty, all non-read-only actions across all tools require approval. This is the strictest setting. Single-purpose tools (like exec, write_file) automatically resolve their declared action name — use "exec.execute" or "write_file.write" in the actions list.

Behavior

Reflection on tool failures

Reflexion-style failure reflection. When a tool call returns an error, the agent loop optionally invokes a small LLM call that produces a hypothesis (one-sentence cause) and a retry strategy (one concrete instruction for the next attempt). The reflection is appended to the failed result content as a <reflection> block so the next iteration sees it explicitly, and persisted to the tool_reflections table for offline analysis.

Off by default. Bounded per-request and per-tool to keep cost predictable.

Config path: agents.defaults.reflection

[agents.defaults.reflection]
enabled = false
maxPerRequest = 2
maxPerTool = 1
temperature = 0.2
maxTokens = 200
persistToDb = true
allowedTools = []      # empty = all tools eligible
blockedTools = []      # subset to silence even when enabled

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Master switch. Off while gathering data; turn on once metrics show retries are succeeding. |
| maxPerRequest | integer | 2 | Maximum reflections produced within a single agent run. Hard cap on per-request cost. |
| maxPerTool | integer | 1 | Maximum reflections per (tool, action) pair within a single run. Prevents reflecting on the same broken call repeatedly. |
| temperature | float | 0.2 | Sampling temperature for the reflection call. Low for determinism. |
| maxTokens | integer | 200 | Maximum response tokens for the reflection call. |
| persistToDb | bool | true | Persist each reflection to the tool_reflections table. Disable in tests; leave on in production for offline analysis. |
| allowedTools | string[] | [] | When non-empty, reflection only fires for tools in this list. Useful for staged rollout. |
| blockedTools | string[] | [] | Tools that never get reflection even when enabled. Useful for chronic-failure tools. |
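How the per-request and per-tool caps interact with the allow and block lists can be sketched as a budget object that lives for one agent run. `ReflectionBudget` and `try_consume` are hypothetical names, and the order of the checks is an assumption.

```python
class ReflectionBudget:
    def __init__(self, max_per_request=2, max_per_tool=1,
                 allowed_tools=(), blocked_tools=()):
        self.max_per_request = max_per_request
        self.max_per_tool = max_per_tool
        self.allowed = set(allowed_tools)
        self.blocked = set(blocked_tools)
        self.total = 0
        self.per_pair = {}  # (tool, action) -> reflections so far

    def try_consume(self, tool: str, action: str) -> bool:
        """Return True and count it if a reflection may run for this failure."""
        if tool in self.blocked:
            return False
        if self.allowed and tool not in self.allowed:
            return False  # a non-empty allowlist restricts eligibility
        if self.total >= self.max_per_request:
            return False
        key = (tool, action)
        if self.per_pair.get(key, 0) >= self.max_per_tool:
            return False
        self.total += 1
        self.per_pair[key] = self.per_pair.get(key, 0) + 1
        return True
```

With the defaults, a second failure of the same (tool, action) pair in one run is not reflected on, and a third distinct failure is skipped because the per-request cap of 2 is already spent.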

CLI

Reflection stats are queryable via oxicrab stats reflections:

$ oxicrab stats reflections --days 7
Tool Reflections (last 7 days)
tool                   action          total     ok    err pending fail_rate
shell                  execute             3      2      1       0     33.3%
github                 list_prs            1      0      0       1       n/a

The summary highlights any (tool, action) with a failure rate ≥ 50% as a candidate for the blockedTools list.

Output format

The injected block looks like:

<reflection>
attempt: 1
hypothesis: file path was relative to the wrong directory
retry_strategy: pass the absolute path or chdir first
</reflection>

Metrics

Safety

The original error string is redacted through the leak detector before being sent to the reflection model, and the model's hypothesis and retry-strategy are redacted again before being appended to the tool result and persisted. Both the prompt and the result share the existing safety perimeter.

Skills

Markdown-based knowledge files loaded into the system prompt. Each skill lives at ~/.oxicrab/workspace/skills/<name>/<name>.md with optional YAML frontmatter (name, description, hints). Skills are pre-scanned for prompt-injection and credential-exfiltration patterns before injection.

Embedding-indexed retrieval

The skills_index SQLite table stores a per-skill embedding keyed by file SHA256. SkillIndex::rebuild() re-embeds only changed files (the SHA mismatch triggers a re-index). SkillIndex::top_k_for_query() ranks indexed skills by cosine similarity against the embedded user query. Usage counters (use_count, last_used_ms) are bumped on each retrieval.

Proposing new skills

The propose_skill helper writes a candidate file to workspace/skills/staged/<name>.md. Staged skills are not loaded into the system prompt. promote_staged_skill moves a staged file into its active per-skill directory after re-running the safety scanner and verifying the staged path is a regular file (not a symlink).

Hygiene

Startup hygiene calls prune_unused_skill_index with a 30-day cutoff and min_uses = 1: any indexed skill that is ≥30 days old, has been used zero times, and has not been touched since creation is dropped from the index. Skills with any usage history are kept regardless of age. SkillIndex::rescan_active_skills re-runs the safety scanner against every active skill so files that were clean when promoted but match a newly-added pattern are surfaced via warn! + oxicrab_skill_hygiene_newly_blocked_total.

Skill names must match [A-Za-z0-9][A-Za-z0-9_-]{0,63}: alphanumeric plus _ and -, 1–64 characters, no leading _ or -, no path components.
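A quick way to check candidate names against that pattern, as a sketch (`is_valid_skill_name` is a hypothetical helper; the regular expression is the one quoted above, applied to the whole string):

```python
import re

SKILL_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")


def is_valid_skill_name(name: str) -> bool:
    # fullmatch anchors both ends, so path separators, dots, and
    # trailing newlines are all rejected.
    return SKILL_NAME.fullmatch(name) is not None
```

Names such as `deploy-notes` pass, while `_hidden`, `../etc`, and anything longer than 64 characters are rejected.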

Configuration

Config path: agents.defaults.skills

[agents.defaults.skills]
indexingEnabled = true
autoRebuildOnStartup = true
maxSystemPromptSkills = 5
pruneUnusedDays = 30
embeddingModelId = ""

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| indexingEnabled | bool | true | Maintain the embedding-indexed skills_index table. When false, retrieval falls back to keyword/hint matching. |
| autoRebuildOnStartup | bool | true | Run SkillIndex::rebuild once at agent startup. Best-effort; spawned async so it doesn't block. |
| maxSystemPromptSkills | integer | 5 | Hard cap on skills retrieved per turn. Bounds system-prompt growth. |
| pruneUnusedDays | integer | 30 | Hygiene drops indexed skills older than this with use_count = 0. |
| embeddingModelId | string | "" | Identifier for the embedding model. When changed, the next rebuild bulk-invalidates rows produced by a different model. Empty = use the value from agents.defaults.memory.embeddingsModel. |

skill_propose tool

Action-based deferred tool exposing the propose/promote/reject helpers to the LLM. Discoverable via tool_search. Actions:

Trajectory

Logs every dispatched tool_call, tool_result, and turn_end event into the trajectory_events SQLite table. Powers cross-session pattern detection and a daily compression pass that summarises old sessions into trajectory_summaries, dropping the raw events.

Off by default — one INSERT per tool call is small but not free. Turn it on once you want the data; the auto-suggest pipeline reads the same table to detect repeating workflows.

Configuration

Config path: agents.defaults.trajectory

[agents.defaults.trajectory]
enabled = false
compressAfterDays = 90

[agents.defaults.trajectory.autoSuggest]
enabled = false
minOccurrences = 5
minSequenceLength = 2
maxSequenceSteps = 8
useLlmBody = false

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Master switch. When off, no events are written and the auto-suggest pass is a no-op. |
| compressAfterDays | integer | 90 | Sessions where every event predates this cutoff are compressed by the daily maintenance task: a trajectory_summaries row replaces the raw events. Set to 0 to disable compression. |
| autoSuggest.enabled | bool | false | After each turn-end, scan for repeating cross-session tool sequences. The top uncovered candidate is staged at workspace/skills/staged/auto_<name>.md for operator review; it is never auto-promoted. |
| autoSuggest.minOccurrences | integer | 5 | A sequence must repeat at least this many times across distinct turns to qualify. |
| autoSuggest.minSequenceLength | integer | 2 | Minimum tool calls per turn for that turn to count toward the occurrence tally. |
| autoSuggest.maxSequenceSteps | integer | 8 | Cap on per-sequence step count. Larger sequences are truncated. |
| autoSuggest.useLlmBody | bool | false | When true, fire a small LLM call to write a purpose-specific skill body for staged candidates instead of the fixed template. Cost: ~1k tokens per staged candidate. Falls back to template on failure. |

Coverage check

Before staging, the suggester reads the skills_index table and skips any candidate where every step's tool name appears in an existing skill's name or description. Conservative — a false-positive coverage just suppresses one candidate, never silently rewrites anything.

Skill auto-refine

After a turn that loaded a skill into context AND ran ≥ minToolCalls tool calls, fire a two-round LLM pass to decide whether the skill body could be tightened or expanded based on what just happened.

Round 1 returns a JSON assessment: {should_patch, confidence, reason}. Round 2 only fires when confidence ≥ confidenceThreshold and produces the new body. Patches are written atomically (temp + rename), audited via a {name}-CHANGELOG.md sidecar, and persisted to the skill_refinements SQLite table. The skill_refinement count powers a deterministic version (1.{N+1}.0).
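The atomic write (temp + rename) and the deterministic version string can be sketched as below. Both helpers are hypothetical illustrations, not oxicrab's code, and the sketch assumes N in 1.{N+1}.0 is the number of refinements recorded before the current patch.

```python
import os
import tempfile


def refined_version(refinement_count: int) -> str:
    """Version string 1.{N+1}.0 for a skill with N prior refinements."""
    return f"1.{refinement_count + 1}.0"


def write_atomically(path: str, body: str) -> None:
    """Write to a temp file in the same directory, then rename over `path`.

    Readers see either the old body or the new one, never a partial file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(body)
        os.replace(tmp_path, path)  # atomic when both are on one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave stray temp files behind
        raise
```

Creating the temp file next to the target matters: a rename is only atomic when source and destination are on the same filesystem.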

Off by default — costs roughly two small LLM calls per qualifying turn, plus disk writes to your skill files.

Configuration

Config path: agents.defaults.skillRefine

[agents.defaults.skillRefine]
enabled = false
confidenceThreshold = 0.7
minToolCalls = 3
maxTokens = 800

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Master switch. |
| confidenceThreshold | float | 0.7 | Round 2 only fires when round 1's confidence is at or above this value. |
| minToolCalls | integer | 3 | Skip refinement unless the just-completed turn ran at least this many tool calls. |
| maxTokens | integer | 800 | Maximum response tokens for both rounds. Round 1 is internally capped at 400. |

What gets patched

The candidate skill is the highest-priority hit from ContextBuilder::select_skills_for_query against the trigger user message — the same selection logic the system prompt uses, so a turn's "active skill" matches what the LLM actually saw. Recent CHANGELOG entries are injected into the round-1 prompt so the model skips re-patching gaps that earlier sessions already addressed.

Activity journal

Append-only NDJSON timeline at <workspace>/activity_journal.ndjson. Every user inbound and agent outbound is written as one JSON line with a UTC timestamp, session key, role (user/agent/system), and content. The query_activity tool is registered only when this is enabled.

The journal is never auto-rotated. Operators rotate or archive the file manually when it grows beyond their taste — deletion is safe; a fresh file is created on the next write.

Configuration

Config path: agents.defaults.activityJournal

[agents.defaults.activityJournal]
enabled = false
maxContentChars = 512
defaultWindowMinutes = 60
maxWindowMinutes = 1440

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Master switch. Appends one line per user inbound and one per agent outbound. |
| maxContentChars | integer | 512 | Truncate stored content (UTF-8 boundary respected) to this many chars. Truncated entries get a trailing marker. |
| defaultWindowMinutes | integer | 60 | Default half-window when the agent doesn't pass window_minutes. |
| maxWindowMinutes | integer | 1440 | Hard cap on window_minutes accepted from the LLM. Bounds disk reads. |

Daily maintenance

An always-on 24-hour ticker runs run_hygiene + cleanup_workspace_files regardless of these flags. When trajectory is enabled, it also performs trajectory compression on dormant sessions older than compressAfterDays. The first pass runs at startup; subsequent passes fire 24h apart.

LLM-as-Judge

Poison-resistant semantic gate for tool calls, adopted from IronClaw PR #2845. Fires after the operator approval workflow but before registry.execute: a small LLM looks at (tool_name, args, user_intent) and returns {verdict: allow|block, reason: …}. When the verdict is block, the tool call is rejected with the reason surfaced to the agent as a tool error so it can re-plan.

The judge sees only the tool name, the (credential-scrubbed) args, and the user's original message. It does NOT see the conversation history or prior tool results — including those would let an attacker poison the judge with the same injection that poisoned the agent.

Fail-open by default: timeouts, provider errors, and malformed JSON all default to allow. The judge is defense-in-depth, not the only gate — silent fail-open keeps a flaky sidecar from bricking the agent.
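The fail-open rule can be sketched as a parser for the judge's reply. `parse_verdict` is a hypothetical helper; passing `None` stands in for a timeout or provider error, and treating an unrecognized verdict value as allow is an assumption in the same fail-open spirit.

```python
import json


def parse_verdict(raw):
    """Return (allowed, reason) from the judge's raw reply.

    Anything that is not a well-formed {"verdict": "block", ...} object
    is treated as allow, so a flaky judge never blocks the agent.
    """
    try:
        data = json.loads(raw)
        verdict = data["verdict"]
    except (TypeError, ValueError, KeyError):
        # None (no reply), invalid JSON, non-object JSON, or missing key.
        return True, "fail-open: no usable verdict"
    if verdict == "block":
        return False, str(data.get("reason", ""))
    if verdict == "allow":
        return True, str(data.get("reason", ""))
    return True, "fail-open: unknown verdict"
```

Only an explicit, well-formed block verdict rejects the tool call; every other outcome lets it proceed to the next gate.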

Configuration

Config path: agents.defaults.judge

[agents.defaults.judge]
enabled = false
maxTokens = 200
timeoutSeconds = 5
allowedTools = []
blockedTools = []

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Master switch. Costs one small LLM call per covered tool dispatch when on. |
| maxTokens | integer | 200 | Per-call response token cap. Bounds the verdict length. |
| timeoutSeconds | integer | 5 | Hard ceiling on the judge LLM call. Beyond this, fail-open. |
| allowedTools | string[] | [] | When non-empty, only these tools get judged. Use to roll out per-tool. Mutually exclusive with blockedTools (allow wins). |
| blockedTools | string[] | [] | Tools that never get judged even when enabled = true. Useful for high-volume read-only tools where the cost outweighs the safety win. |

Provider/model

The judge uses the main agent provider/model — there's no separate task override today. If you want a cheaper/dedicated judge model, request it. Cost is bounded by maxTokens + timeoutSeconds regardless.

Recommended rollout

  1. Leave enabled = false until you have a baseline.
  2. Turn on for high-risk tools first via allowedTools = ["exec", "write_file", "send_message"].
  3. Watch oxicrab_judge_blocked_total (planned metric) and false-positive reports for a week.
  4. Expand the allowlist as confidence grows.

The judge sits between the operator approval workflow and tool execution: exfiltration guard → MCP allowlist → operator approval → judge → param validation → registry.execute.

Memory promotion (recall-driven)

Memory entries that prove useful in retrieval should outlive the 180-day daily-note retention window. The promotion pass scans memory_search_hits over a lookback window for daily: entries that were retrieved frequently across distinct queries, then rewrites their source_key from daily:<date>... to knowledge:auto:<date>... — the knowledge: prefix is exempt from retention purge.

Adopted from openclaw's recordShortTermRecalls + short-term-promotion pattern. The data was already collected in memory_access_log; this just feeds it back. Runs during the daily maintenance ticker.

Configuration

Config path: agents.defaults.memory.promotion

[agents.defaults.memory.promotion]
enabled = false
minRecalls = 5
minUniqueQueries = 2
daysBack = 30

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Master switch. Off by default; operators should verify their oxicrab stats search output looks healthy first. |
| minRecalls | integer | 5 | An entry must have appeared in this many search results across the lookback window to qualify. |
| minUniqueQueries | integer | 2 | Across at least this many distinct queries. Protects against one popular query dominating the signal. |
| daysBack | integer | 30 | Lookback window, in days, for the recall histogram. |
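The qualification rule and the key rewrite can be sketched as two small functions. The names `qualifies` and `promote_key`, the input shape (a list of query strings that retrieved one entry inside the lookback window), and the example key are assumptions made for illustration; only the thresholds and the daily: to knowledge:auto: prefix swap come from the description above.

```python
def qualifies(hit_queries, min_recalls=5, min_unique_queries=2):
    """hit_queries: one query string per time the entry appeared in results."""
    enough_recalls = len(hit_queries) >= min_recalls
    enough_variety = len(set(hit_queries)) >= min_unique_queries
    return enough_recalls and enough_variety


def promote_key(source_key):
    """Rewrite a daily: key to knowledge:auto:, leaving other keys untouched."""
    prefix = "daily:"
    if not source_key.startswith(prefix):
        return source_key
    return "knowledge:auto:" + source_key[len(prefix):]
```

An entry retrieved five times by the same query does not qualify under the defaults, because it fails the distinct-query check.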

LLM request timeout

Hard timeout on each LLM provider call. Without it, a hung provider holds the per-session processing lock indefinitely — the channel goes silent forever. Adopted from nanobot PR #3428.

Config path: agents.defaults.llmRequestTimeoutSeconds (top-level under [agents.defaults]). Default 300 (5 minutes). Set to 0 to disable — only do that when you're sure your provider will never hang or you're testing latency under stress.

[agents.defaults]
llmRequestTimeoutSeconds = 300

On timeout the loop synthesises an error result, releases the session lock, and the channel is unblocked. The next inbound message is processed normally.
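The timeout-then-release behaviour can be sketched with asyncio. This is an analogy in Python, not oxicrab's Rust code: `call_with_timeout`, `process_turn`, and the shape of the synthesised error are hypothetical, and treating 0 as "no timeout" mirrors the config description.

```python
import asyncio


async def call_with_timeout(make_call, timeout_seconds):
    """Await make_call(), returning a synthetic error result on timeout."""
    if timeout_seconds == 0:
        # 0 disables the timeout entirely.
        return await make_call()
    try:
        return await asyncio.wait_for(make_call(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return {"error": f"LLM request timed out after {timeout_seconds}s"}


async def process_turn(session_lock, make_call, timeout_seconds):
    """Hold the per-session lock only for the duration of one bounded call."""
    async with session_lock:
        return await call_with_timeout(make_call, timeout_seconds)
```

Because the lock is held inside `async with`, it is released whether the call returns normally or times out, so the next message in the session can proceed.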

Context Providers

External shell commands that inject dynamic content into the system prompt each turn. Each provider runs its command, caches the output with a TTL, and appends it under a # Dynamic Context header.

Config path: agents.defaults.contextProviders

[[agents.defaults.contextProviders]]
name = "Git Status"
command = "git"
args = ["status", "--short"]
enabled = true
timeout = 5
ttl = 60
requiresBins = ["git"]
requiresEnv = []

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | Section header in the system prompt |
| command | string | required | Executable to run |
| args | string[] | [] | Command arguments |
| enabled | bool | true | Enable or disable this provider |
| timeout | u64 | 5 | Execution timeout in seconds |
| ttl | u64 | 300 | Cache lifetime in seconds before re-executing |
| requiresBins | string[] | [] | Required binaries (skipped if any missing) |
| requiresEnv | string[] | [] | Required environment variables (skipped if any missing) |

Providers that fail, time out, or have missing dependencies are silently skipped — they never block the agent loop.

Validation: Commands must be non-empty, contain no control characters, and not include path separators. This prevents injection through crafted command strings.
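The validation rule and the TTL cache can be sketched as follows. Names such as `validate_command` and `TtlCache` are hypothetical, and treating both `/` and `\` as path separators is an assumption; the source only says "path separators".

```python
import time


def validate_command(command):
    """Non-empty, no control characters, no path separators."""
    if not command:
        return False
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in command):
        return False
    return "/" not in command and "\\" not in command


class TtlCache:
    """Cache one provider's output and re-run it only after ttl seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.value = None
        self.expires_at = None

    def get(self, produce):
        now = self.clock()
        if self.expires_at is None or now >= self.expires_at:
            self.value = produce()  # e.g. run the provider command
            self.expires_at = now + self.ttl
        return self.value
```

Rejecting path separators means a command such as `git` is resolved through PATH, while something like `/tmp/evil` or `../bin/sh` is refused outright.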

Gateway

Controls the HTTP gateway server used by oxicrab gateway. Config path: gateway

[gateway]
enabled = true
host = "127.0.0.1"
port = 18790
apiKey = "your-secret-api-key"

[gateway.webhooks.github]
enabled = true
secret = "your-hmac-secret"
template = "GitHub {{action}} on {{repository.full_name}}: {{body}}"
agentTurn = true

[[gateway.webhooks.github.targets]]
channel = "slack"
chatId = "C12345"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable or disable the HTTP gateway server |
| host | string | 127.0.0.1 | Bind address for the gateway HTTP server |
| port | u16 | 18790 | Port for the gateway HTTP server |
| apiKey | string | "" | API key for authenticating /api/chat and A2A task endpoints. Requests must include Authorization: Bearer <key> or X-API-Key: <key>. Minimum 32 characters when host is non-loopback. When empty and host is non-loopback, a startup warning is emitted. Health, webhooks (HMAC), and A2A discovery are always public. |
| webhooks | object | {} | Named webhook receivers (see below) |
| a2a | object | {} | Agent-to-Agent protocol configuration (see below) |
| rateLimit | object | {} | Per-IP rate limiting configuration (see below) |

HTTP API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/chat | POST | Send a message and receive the agent's response. Body: {"message": "...", "session_id": "..."} |
| /api/health | GET | Health check. Returns {"status": "ready"/"starting", "version": "..."} |
| /api/status | GET | System status: models, tools, channels, tokens, cron, safety, gateway, memory. Auth-gated, rate-limited. |
| /status | GET | HTML status dashboard. Public, auto-refreshes every 60s. Fetches data from /api/status. |
| /api/webhook/{name} | POST | Receive a webhook from an external service (see webhook config below) |
| /.well-known/agent.json | GET | A2A AgentCard (when A2A enabled) |
| /a2a/tasks | POST | Submit an A2A task. Body: {"message": "..."} |
| /a2a/tasks/{id} | GET | Get A2A task status and result |
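A client request to /api/chat can be sketched with the standard library. The helper name `build_chat_request` is hypothetical, and sending `Content-Type: application/json` is an assumption; the body fields and the `Authorization: Bearer <key>` header come from the tables above. The response shape is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, message, session_id):
    """Build (but do not send) an authenticated POST to /api/chat."""
    body = json.dumps({"message": message, "session_id": session_id})
    return urllib.request.Request(
        base_url + "/api/chat",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",  # assumed, not documented
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it; the same key could instead be supplied in an `X-API-Key` header.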

Agent-to-Agent (A2A) Protocol

Google's A2A protocol for agent discovery and interoperability. When enabled, exposes an AgentCard at /.well-known/agent.json and a task lifecycle at /a2a/tasks.

[gateway.a2a]
enabled = true
agentName = "My Agent"
agentDescription = "A helpful AI assistant"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable A2A protocol endpoints |
| agentName | string | "" | Agent name in the AgentCard |
| agentDescription | string | "" | Agent description in the AgentCard |

Tasks are processed through the same agent loop as chat messages. The task lifecycle: submitted → working → completed (or failed). Poll GET /a2a/tasks/{id} to check status.

Rate Limiting

Config path: gateway.rateLimit

Per-IP rate limiting for all gateway endpoints. Uses a token bucket algorithm with configurable sustained rate and burst capacity.

[gateway.rateLimit]
enabled = true
requestsPerSecond = 10
burst = 20
trustProxy = true
trustedProxies = ["10.0.0.0/8", "192.168.0.0/16"]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable per-IP rate limiting |
| requestsPerSecond | u32 | 10 | Sustained request rate per IP |
| burst | u32 | 20 | Maximum burst capacity per IP |
| trustProxy | bool | false | Trust X-Forwarded-For header for client IP extraction. Enable only when running behind a reverse proxy. |
| trustedProxies | string[] | [] | Exact IPs or CIDRs allowed to supply X-Forwarded-For. Required when trustProxy is enabled. |

When a client exceeds the rate limit, the gateway returns HTTP 429 with a Retry-After header indicating when to retry.
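The token bucket behaviour can be sketched for a single client IP. This is a generic token bucket, not the gateway's code: the class name, the injectable clock, and the way the retry delay is computed are assumptions. `rate` corresponds to requestsPerSecond and `burst` to burst.

```python
import time


class TokenBucket:
    """One bucket per client IP: refills at `rate`/s, holds at most `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock
        self.tokens = float(burst)  # start full so a fresh client can burst
        self.last = clock()

    def try_acquire(self):
        """Return (allowed, retry_after_seconds)."""
        now = self.clock()
        refill = (now - self.last) * self.rate
        self.tokens = min(self.burst, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0.0
        # Time until one whole token is available again.
        return False, (1.0 - self.tokens) / self.rate
```

With the defaults (10 requests per second, burst 20), a client can send 20 requests at once, after which requests are admitted at the sustained rate; the second return value is the kind of figure a Retry-After header would carry.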

Webhook Configuration

Each entry in webhooks creates a receiver at POST /api/webhook/{name}. Payloads are validated with HMAC-SHA256 signature verification (constant-time comparison).

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable or disable this webhook endpoint. Disabled webhooks return 404 |
| secret | string | | HMAC-SHA256 secret for signature validation. Minimum 32 characters. |
| template | string | {{body}} | Message template. Use {{key}} for JSON payload fields, {{body}} for raw body |
| targets | array | [] | Delivery targets: [{"channel": "slack", "chatId": "C12345"}] |
| agentTurn | bool | false | If true, routes through the agent loop before delivering to targets |

Signature headers checked: X-Signature-256, X-Hub-Signature-256, X-Webhook-Signature. Supports sha256= prefix (GitHub-style). Max payload: 1 MB.
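Signing and verifying a payload can be sketched with Python's `hmac` module. The function names are hypothetical, and hex encoding of the digest is an assumption based on the GitHub-style `sha256=` prefix; the HMAC-SHA256 algorithm and the constant-time comparison come from the description above.

```python
import hashlib
import hmac


def sign(secret, body):
    """Produce a GitHub-style signature header value for a raw body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify(secret, body, header_value):
    """Check a signature header, with or without the sha256= prefix."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    prefix = "sha256="
    received = header_value[len(prefix):] if header_value.startswith(prefix) else header_value
    # compare_digest runs in constant time with respect to the contents.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

A sender would compute `sign(secret, raw_body)` over the exact bytes it transmits and place the result in one of the signature headers listed above.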

Set host to "0.0.0.0" to listen on all interfaces (required for Docker/container deployments). The Twilio channel uses this same gateway for its webhook listener.

Webhook Dispatch

Each webhook can optionally include a dispatch object for structured direct tool execution, bypassing the LLM entirely.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| dispatch.tool | string | | Tool name to execute directly |
| dispatch.paramsTemplate | object | | JSON parameters template with {{key}} substitution from the webhook payload |

When dispatch is present, the webhook payload is parsed and template variables are substituted into the params, then the tool is executed directly without LLM involvement. Mutually exclusive with agentTurn.
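The {{key}} substitution used by template and dispatch.paramsTemplate can be sketched as below. Several details are assumptions: dotted keys walking nested objects (inferred from `{{repository.full_name}}` in the gateway example), missing keys rendering as an empty string, non-string values rendering as JSON, and substitution applying only to top-level string values of the params object. All function names are hypothetical.

```python
import json
import re

_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_.]+)\s*\}\}")


def lookup(payload, dotted_key):
    """Walk a dotted key through nested dicts; missing keys yield ''."""
    current = payload
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return ""
        current = current[part]
    return current if isinstance(current, str) else json.dumps(current)


def render(template, payload, raw_body):
    """Replace {{body}} with the raw body and {{key}} with payload fields."""
    def replace(match):
        key = match.group(1)
        return raw_body if key == "body" else lookup(payload, key)
    return _PLACEHOLDER.sub(replace, template)


def render_params(params_template, payload, raw_body):
    """Apply render() to every top-level string value of a params object."""
    return {
        name: render(value, payload, raw_body) if isinstance(value, str) else value
        for name, value in params_template.items()
    }
```

For a dispatch webhook, the rendered params object would be handed straight to the named tool, with no LLM call in between.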

Observability

Controls runtime telemetry exporters. Config path: observability

[observability.metrics]
enabled = true
bind = "127.0.0.1:9901"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| metrics.enabled | bool | false | Enable Prometheus metrics exporter |
| metrics.bind | string | 127.0.0.1:9901 | HTTP bind address for /metrics endpoint |

When enabled, oxicrab installs a process-wide Prometheus recorder and serves metrics at http://{bind}/metrics. This includes router counters/histograms such as route decisions, policy drift, semantic confidence, and blocked tool attempts.

Router

Controls the message router that pre-classifies inbound messages before LLM involvement. Config path: router

[router]
prefix = "!"
semanticTopK = 3
semanticPrefilterK = 12
semanticThreshold = 0.5

[[router.rules]]
trigger = "weather"
tool = "weather"

[router.rules.params]
action = "forecast"
location = "$1"

[[router.rules]]
trigger = "remind"
tool = "cron"

[router.rules.params]
action = "add"
message = "$*"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| prefix | string | "!" | Command prefix character for prefix commands. Avoids collision with Slack/Discord slash commands. |
| rules | array | [] | User-defined prefix commands. Each rule has trigger (command word), tool (tool name), params (JSON with $1/$2/$* substitution for positional arguments). |
| semanticTopK | usize | 3 | Maximum number of tools retained when semantic filtering is applied to unconstrained turns. |
| semanticPrefilterK | usize | 12 | Lexical prefilter candidate size before semantic reranking. Must be >= semanticTopK. |
| semanticThreshold | f32 | 0.5 | Minimum semantic score required to keep a candidate tool. Values are clamped to [-1.0, 1.0]. |
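How the three semantic settings might interact can be sketched as a prefilter, threshold, and top-K pipeline. This is a guess at the shape of the algorithm from the field descriptions alone: the function name and the two scoring callbacks are hypothetical stand-ins for whatever lexical and embedding scorers the router actually uses.

```python
def select_tools(candidates, lexical_score, semantic_score,
                 top_k=3, prefilter_k=12, threshold=0.5):
    """Lexical prefilter, then semantic rerank with a score floor."""
    if prefilter_k < top_k:
        raise ValueError("semanticPrefilterK must be >= semanticTopK")
    threshold = max(-1.0, min(1.0, threshold))  # clamp to [-1.0, 1.0]
    # Stage 1: keep the prefilter_k best lexical matches.
    shortlist = sorted(candidates, key=lexical_score, reverse=True)[:prefilter_k]
    # Stage 2: score semantically, drop anything under the threshold.
    scored = [(semantic_score(tool), tool) for tool in shortlist]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in kept[:top_k]]
```

A tool that scores well semantically but falls outside the lexical shortlist is never reranked, which is why the prefilter size has to be at least as large as the final top-K.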

The message router runs at the top of message processing and chooses deterministic dispatch first, constrained LLM second, and full LLM last. Prefix commands (e.g. !weather London) are dispatched directly to the named tool without an LLM call.
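Prefix-command dispatch with positional substitution can be sketched as follows. The function names are hypothetical, and several behaviours are assumptions: arguments are split on whitespace, `$*` expands to all arguments joined by spaces, a placeholder must be the whole parameter value, and a missing positional argument becomes an empty string.

```python
def substitute(value, args):
    """Expand $1, $2, ... and $* in a single parameter value."""
    if not isinstance(value, str):
        return value
    if value == "$*":
        return " ".join(args)
    if value.startswith("$") and value[1:].isdigit():
        index = int(value[1:]) - 1
        return args[index] if 0 <= index < len(args) else ""
    return value


def dispatch_prefix(message, rules, prefix="!"):
    """Return (tool, params) for a matching prefix command, else None."""
    if not message.startswith(prefix):
        return None
    words = message[len(prefix):].split()
    if not words:
        return None
    trigger, args = words[0], words[1:]
    for rule in rules:
        if rule["trigger"] == trigger:
            params = {k: substitute(v, args) for k, v in rule["params"].items()}
            return rule["tool"], params
    return None
```

With the two rules from the example config, `!weather London` resolves to the weather tool with `location` set to `London`, and `!remind buy milk` resolves to the cron tool with `message` set to `buy milk`; neither needs an LLM call.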

Each route emits a strict policy object used by execution: allowed_tools, blocked_tools, and reason. Tool execution enforces this policy unconditionally. For diagnostics, use !router_replay [n] (alias: !route_replay [n]) to view route decisions for recent turns in the current session.

Sandbox

Kernel-enforced filesystem and network restrictions applied to both shell commands (tools.exec.sandbox) and MCP server child processes (tools.mcp.servers.*.sandbox). On Linux, uses Landlock LSM. On macOS, uses Seatbelt (sandbox_init). Graceful no-op on unsupported platforms or older Linux kernels.

[tools.exec.sandbox]
enabled = true
additionalReadPaths = ["/opt/data"]
additionalWritePaths = ["/home/user/output"]
blockNetwork = true

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable kernel-level process sandboxing (Landlock on Linux, Seatbelt on macOS) |
| additionalReadPaths | string[] | [] | Extra paths to grant read-only access (beyond /usr, /lib, /lib64, /bin, /sbin, /etc) |
| additionalWritePaths | string[] | [] | Extra paths to grant read-write access (beyond workspace + /tmp + /var/tmp) |
| blockNetwork | bool | true | Block all outbound TCP connections from shell commands |

Linux (Landlock): Default read-only: /usr, /lib, /lib64, /bin, /sbin, /etc. Default read-write: workspace dir, /tmp, /var/tmp. Degrades gracefully on older kernels via BestEffort mode.

macOS (Seatbelt): Same default paths plus macOS-specific system paths (/System, /Library, /opt/homebrew, /usr/local) for read-only, and symlink targets (/private/tmp, /private/var/folders) for read-write. Also grants process execution, Mach IPC, and signal operations required for child processes.

All other filesystem access and network connections are denied. Use oxicrab doctor to check sandbox availability on your system.

Channels

Per-channel configuration under channels.{name}. See Channel Setup for step-by-step guides. All channels share these common fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable this channel |
| allowFrom | string[] | [] | Authorized sender IDs. Empty = deny-all. Use ["*"] for open access. |
| allowGroups | string[] | [] | Restrict which groups/channels the bot responds in. Empty = deny-all. Use ["*"] for open access. Non-empty = only listed group IDs. |
| dmPolicy | string | "allowlist" | DM access policy: "allowlist", "pairing", or "open" |

dmPolicy

Controls what happens when an unrecognized sender messages the bot on a channel.

| Value | Behavior |
| --- | --- |
| "allowlist" | Check allowFrom + pairing store. Silently drop unrecognized senders. This is the default and preserves the original behavior. |
| "pairing" | Check allowFrom + pairing store. If unknown, generate an 8-character pairing code and send it to the sender. The bot owner can then approve with oxicrab pairing approve {code}. |
| "open" | Allow all senders unconditionally. No access checks are performed. |

See Channel Setup → Common patterns for a detailed walkthrough and access check flowchart.
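The three dmPolicy values can be condensed into one decision function. This is a sketch of the behaviour described in the table, not the real access check: the function name, the string return values, and modelling the pairing store as a plain set of approved sender IDs are all assumptions.

```python
def check_dm(sender_id, allow_from, paired, dm_policy="allowlist"):
    """Return 'allow', 'drop', or 'send_pairing_code' for a DM sender."""
    if dm_policy == "open":
        return "allow"  # no access checks at all
    recognized = (
        "*" in allow_from
        or sender_id in allow_from
        or sender_id in paired
    )
    if recognized:
        return "allow"
    if dm_policy == "pairing":
        return "send_pairing_code"
    return "drop"  # allowlist: silently ignore unknown senders
```

Note that an empty allowFrom with the default policy drops everyone who has not been paired, which matches the "Empty = deny-all" rule for the common fields.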

Channel-specific fields

Each channel has additional required fields. See Channel Setup for the complete config blocks. Quick reference:

| Channel | Required Fields |
| --- | --- |
| telegram | token. Optional: mentionOnly (boolean, default false): only respond in groups when bot is @mentioned or replied to |
| discord | token. Optional: mentionOnly (boolean, default false): only respond in guilds when bot is @mentioned |
| slack | botToken, appToken. Optional: thinkingEmoji (default "eyes"), doneEmoji (default "white_check_mark") |
| whatsapp | (none; scan QR on first run). In groups the bot only responds when it is mentioned (mentioned_jid) or quote-replied to; this is not configurable. Use a 1:1 chat (DM) for unconditional bot participation. Note: the bot uses your own WhatsApp identity, so mentions/replies aimed at you-the-human in a group will also wake it; see channel setup for the recommended fix. |
| twilio | accountSid, authToken, phoneNumber, webhookPort, webhookPath, webhookUrl (required). Optional: webhookHost (string, default "127.0.0.1"): interface to bind the webhook server; allowGroups (array, default []): restrict to specific Conversation SIDs |

Logging

Logging is controlled by the RUST_LOG environment variable. Oxicrab uses the tracing-subscriber format.

# Default: info level, noisy dependencies suppressed
./target/release/oxicrab gateway

# Debug logging
RUST_LOG=debug ./target/release/oxicrab gateway

# Custom filtering
RUST_LOG=info,whatsapp_rust=warn,oxicrab::channels=debug ./target/release/oxicrab gateway

Common filters:

Config Validation

oxicrab validates configuration at startup and rejects invalid settings with actionable error messages. These checks run after all config layers are merged.

| Rule | Details |
| --- | --- |
| Webhook secrets | Must be at least 32 characters |
| Gateway API key | Must be at least 32 characters when host is non-loopback |
| Approval timeout | Must be at least 10 seconds when approval is enabled |
| Context provider commands | Must be non-empty, contain no control characters, and not include path separators |
| Provider custom headers | Reserved names are blocked: Authorization, Content-Type, x-api-key, and other internal headers |
| Twilio webhookUrl | Required for the Twilio channel to start |
| Shell allowedCommands | A startup warning is emitted when allowedCommands is empty (unrestricted shell access) |
| Tool name shadowing | Built-in tools cannot be overridden by MCP or runtime-registered tools |

Resource Limits

Hard caps applied at the boundaries where attacker- or LLM-controlled input enters the process. None are tunable today — they exist to prevent OOM and runaway-payload conditions, not to be policy knobs.

| Boundary | Cap |
| --- | --- |
| Inbound message (MessageBus::publish_inbound) | 1 MB; longer messages are truncated |
| Gateway request body (chat, webhook, A2A) | 1 MB via DefaultBodyLimit |
| Webhook payload | 1 MB |
| HTTP response bodies (HTTP / web tools) | 10 MB via limited_body() |
| HTML extracted by browser tool | 500 KB |
| Browser screenshot height | 10080 px clamp |
| Audio uploads (cloud transcription) | 25 MB |
| Image generation base64 payload | 30 MB pre-decode check |
| Context files (USER.md, TOOLS.md, AGENTS.md) | 500 KB each |
| Skill file ({name}.md) | 1 MB each |
| Context provider output | 100 KB per call |
| Compaction summary | 2000 chars (prevents unbounded growth across cycles) |