Overview
oxicrab stores all its state in a home directory, defaulting to ~/.oxicrab/. The workspace is a subdirectory that holds the markdown files that define the agent's identity, memory, and context. These files are loaded into the system prompt on every conversation.
The system prompt always begins with the current date and time in natural language (e.g. "The current date and time is Friday, March 6, 2026 at 2:15 PM EST"), giving the agent reliable temporal awareness for scheduling, time-sensitive queries, and relative date references.
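As a rough illustration of that line, the sketch below renders a timezone-aware datetime in the same natural-language shape. The function name `prompt_time_line` and the fixed-offset `EST` zone are assumptions made for this example; oxicrab's own formatting code is not shown in this document.

```python
from datetime import datetime, timedelta, timezone


def prompt_time_line(now: datetime) -> str:
    """Render an aware datetime as a natural-language prompt line."""
    # Build the 12-hour clock by hand to avoid platform-specific strftime flags.
    hour12 = now.hour % 12 or 12
    meridiem = "AM" if now.hour < 12 else "PM"
    return (
        f"The current date and time is {now.strftime('%A, %B')} {now.day}, "
        f"{now.year} at {hour12}:{now.minute:02d} {meridiem} {now.tzname()}"
    )


# Fixed-offset zone used only for this illustration.
EST = timezone(timedelta(hours=-5), "EST")
print(prompt_time_line(datetime(2026, 3, 6, 14, 15, tzinfo=EST)))
# prints: The current date and time is Friday, March 6, 2026 at 2:15 PM EST
```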
Run oxicrab onboard to scaffold the full directory structure with templates.
To use a different location, set the OXICRAB_HOME environment variable, or change the workspace path in config.toml under agents.defaults.workspace.
AGENTS.md template
Defines the agent's identity, personality, behavioral rules, and capabilities. This is the most important workspace file — it shapes how the agent responds to everything.
How it's used
Loaded as the core identity section of the system prompt. Unlike other bootstrap files, AGENTS.md is loaded separately and placed first, before USER.md or TOOLS.md. If the file is missing, oxicrab falls back to a built-in default identity.
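The loading order described above can be sketched roughly as follows. This is a minimal illustration, not oxicrab's implementation: `build_bootstrap_sections` and the `DEFAULT_IDENTITY` text are hypothetical, and skipping a missing USER.md or TOOLS.md is an assumption.

```python
from pathlib import Path

# Placeholder text; the real built-in default identity is not shown in this document.
DEFAULT_IDENTITY = "I am a helpful AI assistant."


def build_bootstrap_sections(workspace: Path) -> list[str]:
    """Return prompt sections in order: identity first, then other bootstrap files."""
    agents = workspace / "AGENTS.md"
    # AGENTS.md is handled separately so it always comes first,
    # with a fallback identity when the file is missing.
    if agents.is_file():
        sections = [agents.read_text(encoding="utf-8")]
    else:
        sections = [DEFAULT_IDENTITY]
    # Remaining bootstrap files follow; missing ones are skipped (assumption).
    for name in ("USER.md", "TOOLS.md"):
        path = workspace / name
        if path.is_file():
            sections.append(path.read_text(encoding="utf-8"))
    return sections
```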
Recommended sections
- Personality — Tone and communication style (e.g. "friendly but professional", "concise")
- Capabilities — Brief overview of what the agent can do (no need to list every tool)
- Behavioral Rules — Constraints like "never guess information", "ask for clarification when ambiguous"
- Action Integrity — Rules about never claiming to have done something without actually calling a tool
- Memory Management — How aggressively the agent should write to memory
- Learned Adaptations — A section the agent updates itself as it learns user preferences
Example
# mybot
I am mybot, a personal AI assistant.
## Personality
- Friendly but professional
- Direct and concise, with detail when needed
- Accuracy over speed
## Capabilities
I have access to tools including file operations, web search,
shell commands, subagents, and more. Some tools require
additional configuration.
## Behavioral Rules
- Reply directly to questions. Your text response will be
delivered to the user automatically.
- Never invent or guess information.
- Ask for clarification when ambiguous.
## Learned Adaptations
*(Updated automatically as I learn preferences)*
USER.md template
Stores information about the user — preferences, timezone, communication style, and anything else the agent should know about you. The agent may also update this file as it learns your patterns.
How it's used
Loaded as a bootstrap file in the system prompt, after AGENTS.md. This gives the agent persistent context about who it's talking to across all sessions.
Example
# User
## Preferences
- Communication style: casual
- Timezone: America/New_York
- Language: English
## Notes
- Prefers metric units
- Works primarily with Python and Rust
TOOLS.md template
A place to record notes about which tools are configured, any quirks or important details, and which services are connected. This is not where tool configuration lives (that's in config.toml) — it's context for the agent.
How it's used
Loaded as a bootstrap file alongside USER.md. Helps the agent understand what tools are available and any special instructions for using them.
Example
# Tool Notes
## Configured Tools
- Google Mail and Calendar are connected (OAuth)
- Weather uses metric units, default location: London
- Todoist is set up for personal project tracking
## API Keys & Services
- Brave Search: configured
- OpenWeatherMap: configured
- GitHub: configured via PAT
Memory Database automatic
All long-term memory is stored directly in a SQLite database. The agent writes facts, user preferences, and conversation context here automatically. The memory_search tool queries this database.
How it's used
Recent memory entries are included in the system prompt for session continuity. The memory_search tool queries the database using SQLite FTS5 full-text search, with optional hybrid vector+keyword search when embeddings are enabled.
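To give a feel for the keyword side of that search, here is a small FTS5 query using Python's built-in sqlite3 module. The table name `memories`, its single `content` column, and the helper `search_memories` are assumptions for illustration; the actual schema is not documented here, and the example requires an SQLite build with FTS5 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one full-text indexed column.
conn.execute("CREATE VIRTUAL TABLE memories USING fts5(content)")
conn.executemany(
    "INSERT INTO memories (content) VALUES (?)",
    [
        ("Prefers metric units",),
        ("Works primarily with Python and Rust",),
        ("Timezone is America/New_York",),
    ],
)


def search_memories(query: str, limit: int = 5) -> list[str]:
    """Return matching entries, best match first (FTS5's built-in rank)."""
    rows = conn.execute(
        "SELECT content FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


print(search_memories("metric"))
```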
What gets stored
- Important facts the agent should always know
- User preferences discovered over time
- Extracted context from conversations (automatic fact extraction during compaction)
- Quick notes via the "remember that..." fast path
Quality gates
Before writing to memory, content passes through quality gates that reject greetings, filler, and very short content. Negative memories are automatically reframed to be constructive unless they already contain resolution markers.
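A quality gate of this kind might look like the sketch below. The word lists, the `MIN_CHARS` threshold, and the function name are all hypothetical; the document does not specify the actual rules, and the reframing of negative memories is not covered.

```python
import re

# Illustrative lists and threshold only; the real values are not documented here.
GREETINGS = {"hi", "hello", "hey", "thanks", "thank you", "good morning"}
FILLER = {"ok", "okay", "sure", "got it", "sounds good", "lol"}
MIN_CHARS = 12


def passes_quality_gate(text: str) -> bool:
    """Reject greetings, filler, and very short content before it reaches memory."""
    # Strip punctuation and case so "Hello!" and "hello" are treated alike.
    normalized = re.sub(r"[^\w\s]", "", text).strip().lower()
    if len(normalized) < MIN_CHARS:
        return False
    return normalized not in GREETINGS and normalized not in FILLER
```

For example, `passes_quality_gate("Good morning!")` is rejected as a greeting, while `passes_quality_gate("User prefers metric units")` passes.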
Configuration
{
  "agents": {
    "defaults": {
      "memory": {
        "embeddingsEnabled": true,
        "embeddingsModel": "BAAI/bge-small-en-v1.5",
        "hybridWeight": 0.5,
        "searchFusionStrategy": "weighted_score",
        "rrfK": 60,
        "embeddingCacheSize": 10000,
        "recencyHalfLifeDays": 90
      }
    }
  }
}
Hybrid vector+keyword search is enabled by default. The embeddings model is downloaded automatically on first use.
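The settings hybridWeight, rrfK, and recencyHalfLifeDays suggest the usual formulas for score fusion and recency decay. The sketch below shows the textbook versions under that assumption: a weighted sum, reciprocal rank fusion, and exponential half-life decay. How oxicrab actually combines scores, and which side hybridWeight applies to, is not stated in this document.

```python
def weighted_score(
    keyword: dict[str, float],
    vector: dict[str, float],
    hybrid_weight: float = 0.5,
) -> dict[str, float]:
    """Blend normalized scores; the weight is assumed to apply to the vector side."""
    ids = keyword.keys() | vector.keys()
    return {
        i: hybrid_weight * vector.get(i, 0.0) + (1 - hybrid_weight) * keyword.get(i, 0.0)
        for i in ids
    }


def rrf(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def recency_factor(age_days: float, half_life_days: float = 90) -> float:
    """Exponential decay: the factor halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


fused = rrf([["a", "b"], ["b", "c"]])
print(max(fused, key=fused.get))  # "b" appears in both rankings, so it wins
```

Rank-based fusion ignores the raw score scales of the two searches, which is why it is often preferred when keyword and vector scores are not directly comparable.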
Skills manual
Custom skills extend the agent's capabilities with domain-specific instructions. Each skill is a directory containing a {skill-name}.md file (matching the directory name) with YAML frontmatter and markdown documentation.
How it's used
Every skill's name, description, and trigger keywords always appear as a compact summary in the system prompt. Full skill content is only loaded when the inbound message matches a hint keyword, using fast Aho-Corasick multi-pattern matching. This keeps the base prompt small while ensuring relevant skills are available when needed.
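For readers unfamiliar with Aho-Corasick, the following compact version shows the idea: all hint keywords are compiled into one automaton, and the message is scanned once regardless of how many keywords exist. This is an illustrative sketch, not oxicrab's matcher; the case-insensitive substring semantics and the function names are assumptions.

```python
from collections import deque


def build_automaton(patterns: list[str]):
    """Build trie transitions, failure links, and per-state output sets."""
    goto, fail, out = [{}], [0], [set()]
    for idx, pattern in enumerate(patterns):
        state = 0
        for ch in pattern.lower():
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(idx)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper states
    # inherit the longest proper suffix that is also a trie prefix.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def match_hints(text: str, patterns: list[str]) -> set[str]:
    """Return every pattern that occurs in text, in a single pass."""
    goto, fail, out = build_automaton(patterns)
    state, found = 0, set()
    for ch in text.lower():
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= out[state]
    return {patterns[i] for i in found}


hints = ["ffmpeg", "transcode", "video convert"]
print(sorted(match_hints("Can you transcode this clip with FFmpeg?", hints)))
# prints: ['ffmpeg', 'transcode']
```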
Skill file format
---
name: my-skill
description: What this skill does
emoji: "\U0001f527"
schedule: "7am, 5pm"
hints:
  - ffmpeg
  - transcode
  - video convert
requires:
  bins: ["ffmpeg", "curl"]
  env: ["MY_API_KEY"]
---
# My Skill
Instructions for the agent on how to use this skill.
## When to use
Describe the situations where this skill applies.
## Steps
1. Step one
2. Step two
Frontmatter fields
- name — Skill identifier (matches directory name)
- description — One-line summary shown in the skills list
- emoji — Display emoji for the skill in the system prompt summary and startup log. Defaults to 🔧 if omitted.
- schedule — Execution times in 12h or 24h format, comma-separated for multiple times (e.g. "7am", "9am, 1pm, 5pm", "7:30am, 17:00"). Creates cron jobs automatically at startup. The cron job's message triggers the hint matcher, loading and executing the full skill content.
- hints — List of trigger keywords. When a message contains any of these words, the full skill content is loaded into the system prompt. If omitted, keywords are auto-extracted from the skill name and description.
- requires.bins — CLI tools that must be installed (checked at runtime)
- requires.env — Environment variables that must be set
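To make the schedule format concrete, the sketch below parses a schedule string into 24-hour times and turns each into a daily five-field cron expression. The helper names and the exact cron form are assumptions for illustration; the document only says that cron jobs are created from these times.

```python
import re

# One time entry: hour, optional :minutes, optional am/pm suffix.
_TIME = re.compile(r"^(\d{1,2})(?::(\d{2}))?\s*(am|pm)?$", re.IGNORECASE)


def parse_schedule(spec: str) -> list[tuple[int, int]]:
    """Parse e.g. "7:30am, 17:00" into [(7, 30), (17, 0)]."""
    times = []
    for part in spec.split(","):
        match = _TIME.match(part.strip())
        if not match:
            raise ValueError(f"unrecognized time: {part.strip()!r}")
        hour, minute = int(match.group(1)), int(match.group(2) or 0)
        suffix = (match.group(3) or "").lower()
        if suffix:
            if not 1 <= hour <= 12:
                raise ValueError(f"invalid 12h hour: {part.strip()!r}")
            # 12am maps to 0 and 12pm stays 12.
            hour = hour % 12 + (12 if suffix == "pm" else 0)
        if hour > 23 or minute > 59:
            raise ValueError(f"time out of range: {part.strip()!r}")
        times.append((hour, minute))
    return times


def to_cron(times: list[tuple[int, int]]) -> list[str]:
    """Daily cron expressions in minute-hour-day-month-weekday order."""
    return [f"{minute} {hour} * * *" for hour, minute in times]


print(to_cron(parse_schedule("7am, 5pm")))
# prints: ['0 7 * * *', '0 17 * * *']
```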
Sessions
Conversation history files, one per session. Stored in JSONL format (one JSON object per line). Sessions are keyed by channel and chat ID.
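A JSONL session file can be read and written with a few lines of code. In this sketch the file naming scheme (`{channel}_{chat_id}.jsonl`) and the `role` and `content` fields are hypothetical; the document only states that sessions are JSONL and keyed by channel and chat ID.

```python
import json
from pathlib import Path


def session_path(sessions_dir: Path, channel: str, chat_id: str) -> Path:
    """Hypothetical naming scheme: one file per (channel, chat ID) pair."""
    return sessions_dir / f"{channel}_{chat_id}.jsonl"


def append_message(path: Path, role: str, content: str) -> None:
    """Append one message as a single JSON object on its own line."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"role": role, "content": content}) + "\n")


def load_recent(path: Path, limit: int) -> list[dict]:
    """Read the file and keep only the last `limit` messages."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        messages = [json.loads(line) for line in f if line.strip()]
    return messages[max(len(messages) - limit, 0):]
```

Because each line is an independent JSON object, new messages can be appended without rewriting the file.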
How it's used
The agent loads the most recent messages from the session file to maintain conversation continuity. An LRU cache of 64 sessions is kept in memory for performance.
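An LRU cache with a fixed capacity can be sketched with an ordered dictionary, as below. The class `SessionCache` is a hypothetical stand-in to show the eviction behavior; only the capacity of 64 comes from the text above.

```python
from collections import OrderedDict


class SessionCache:
    """Least-recently-used cache: the oldest untouched entry is evicted first."""

    def __init__(self, capacity: int = 64):
        self.capacity = capacity
        self._entries: OrderedDict[str, object] = OrderedDict()

    def get(self, key: str):
        if key not in self._entries:
            return None
        # A read counts as a use, so move the entry to the "most recent" end.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key: str, session: object) -> None:
        self._entries[key] = session
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            # Drop the least recently used entry from the front.
            self._entries.popitem(last=False)
```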
Lifecycle
- Auto-pruned to the 200 most recent messages
- Expired after sessionTtlDays (default: 30 days)
- Compaction available to summarize long conversations (configurable via compaction settings)
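The pruning and expiry rules above amount to two small checks, sketched here. The function names are hypothetical, and measuring expiry from the last activity time is an assumption; the document does not say which timestamp the TTL is based on.

```python
from datetime import datetime, timedelta, timezone

MAX_MESSAGES = 200  # pruning limit from the lifecycle list


def prune(messages: list, max_messages: int = MAX_MESSAGES) -> list:
    """Keep only the most recent max_messages entries."""
    return messages[max(len(messages) - max_messages, 0):]


def is_expired(last_activity: datetime, now: datetime, session_ttl_days: int = 30) -> bool:
    """A session expires once it has been idle longer than the TTL (assumption)."""
    return now - last_activity > timedelta(days=session_ttl_days)


now = datetime(2026, 3, 6, tzinfo=timezone.utc)
print(len(prune(list(range(250)))))                 # 200
print(is_expired(now - timedelta(days=31), now))    # True
```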
Other Files
These files live in the oxicrab home directory (~/.oxicrab/) rather than the workspace.
config.toml is the main configuration file. It is TOML with camelCase keys and contains provider API keys, channel tokens, tool settings, and agent defaults. It is created by oxicrab onboard. See the Configuration page for a complete field reference.
Downloaded images, screenshots, and other media are also stored here. They are auto-cleaned after mediaTtlDays (default: 7 days) and used by the web_fetch, http, and browser tools.