A local-first, personal AI coding assistant CLI focused on modular architecture, privacy, and real coding actions.
Version 0.9.41
thunk is a Rust-based personal AI coding assistant built around a small, explicit runtime:
- a terminal UI for chat and control commands
- a runtime that owns conversation state and tool dispatch
- a tool layer with typed inputs and outputs
- SQLite-backed session persistence
- swappable model backends
The project is structured to keep model generation, tool execution, persistence, and UI separate instead of folding everything into one text-driven loop.
- Runtime-owned correctness, not prompt-driven behavior
- Structural execution instead of relying on model reasoning alone
- Grounded code investigation via enforced search → read → answer flow
- Explicit tool surface constraints per turn
- Deterministic correction and terminal outcomes
- Designed to remain correct even with small or imperfect local models
- Built for local-first, low-resource environments
- Runs as a local terminal app with an alternate-screen TUI.
- Supports two model backends: `mock` and `llama_cpp`.
- Builds a system prompt from the app name, project root, and registered tool specs.
- Streams assistant output into the conversation while emitting UI-facing runtime events.
- Parses tool calls centrally in `src/runtime/tool_codec.rs` (sketched below).
- Executes read-only tools immediately and pauses for approval before mutating files.
- Re-enters model generation after tool results so the assistant can synthesize a grounded same-turn answer.
- Uses runtime-owned terminal answers when the runtime already knows the outcome, such as rejected mutations or failed file reads.
- Enforces bounded per-turn `search_code` behavior at runtime instead of relying only on prompt wording.
- Persists sessions in `data/sessions.db` and restores the most recent same-root session on startup.
- Writes best-effort per-session logs under `logs/`.
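For illustration, here is a minimal sketch of what centralized tool-call extraction could look like. The fenced `tool` wire format and the function names are assumptions for the example, not the actual contents of `src/runtime/tool_codec.rs`.

```rust
/// Hypothetical sketch of centralized tool-call extraction.
/// The wire format (a fenced ```tool block with a name line followed by
/// raw arguments) is an assumption; the real codec may differ.
#[derive(Debug)]
struct ToolCall {
    name: String,
    args: String, // raw argument payload, validated later by the tool layer
}

/// Extracts tool calls in document order from raw assistant text.
fn parse_tool_calls(text: &str) -> Vec<ToolCall> {
    let mut calls = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find("```tool") {
        let after = &rest[start + "```tool".len()..];
        let Some(end) = after.find("```") else { break };
        let body = after[..end].trim();
        let mut lines = body.lines();
        if let Some(name) = lines.next() {
            calls.push(ToolCall {
                name: name.trim().to_string(),
                args: lines.collect::<Vec<_>>().join("\n"),
            });
        }
        rest = &after[end + 3..];
    }
    calls
}

fn main() {
    let reply = "Let me look.\n```tool\nread_file\n{\"path\": \"src/main.rs\"}\n```";
    for call in parse_tool_calls(reply) {
        println!("{} -> {}", call.name, call.args);
    }
}
```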
Current built-in tools:
`read_file`, `list_dir`, `search_code`, `edit_file`, `write_file`
Current control commands:
`/help`, `/clear`, `/quit`, `/approve`, `/reject`
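A minimal sketch of how control-command dispatch might look in the TUI layer; the `Command` enum and `parse_command` function are hypothetical names, not the actual implementation.

```rust
/// Hypothetical control-command parser; the real TUI dispatch may differ.
#[derive(Debug, PartialEq)]
enum Command {
    Help,
    Clear,
    Quit,
    Approve,
    Reject,
}

fn parse_command(input: &str) -> Option<Command> {
    match input.trim() {
        "/help" => Some(Command::Help),
        "/clear" => Some(Command::Clear),
        "/quit" => Some(Command::Quit),
        "/approve" => Some(Command::Approve),
        "/reject" => Some(Command::Reject),
        _ => None, // anything else is treated as a prompt for the model
    }
}

fn main() {
    assert_eq!(parse_command("/approve"), Some(Command::Approve));
    assert_eq!(parse_command("fix this bug"), None);
    println!("command parsing holds");
}
```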
At a high level:
- The user submits a prompt in the TUI.
- The runtime sends the full in-memory conversation to the active model backend.
- The assistant response is scanned for tool calls.
- Tool calls are dispatched in document order.
- Immediate tool results are injected back into the conversation as runtime-owned result blocks.
- The runtime normally re-enters generation with those results so the model can answer from actual tool output.
- If a mutating tool proposes a change, the runtime stores a single `PendingAction` and waits for `/approve` or `/reject`.
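A highly simplified sketch of that loop, assuming hypothetical `generate`, `extract_tool_calls`, and `dispatch` helpers; the real runtime is event-driven and more involved.

```rust
// Hypothetical shape of the per-prompt runtime loop; all names are illustrative.
fn run_turn(conversation: &mut Vec<String>, user_prompt: String) {
    conversation.push(format!("user: {user_prompt}"));
    loop {
        // 1. Send the full in-memory conversation to the active backend.
        let reply = generate(conversation);
        conversation.push(format!("assistant: {reply}"));

        // 2. Scan the reply for tool calls; none means the turn is done.
        let calls = extract_tool_calls(&reply);
        if calls.is_empty() {
            break;
        }

        // 3. Dispatch in document order, inject results as runtime-owned
        //    blocks, then re-enter generation so the model answers from
        //    real tool output.
        for call in calls {
            conversation.push(format!("tool_result: {}", dispatch(&call)));
        }
    }
}

fn generate(_conversation: &[String]) -> String { String::from("done") } // stub
fn extract_tool_calls(_reply: &str) -> Vec<String> { Vec::new() } // stub
fn dispatch(call: &str) -> String { format!("ok: {call}") } // stub

fn main() {
    let mut convo = Vec::new();
    run_turn(&mut convo, "where is the config loaded?".into());
    println!("{} messages", convo.len());
}
```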
Some outcomes are deliberately terminal and runtime-owned: rejecting a pending mutation produces a cancellation answer without asking the model to summarize, and a failed `read_file` can end cleanly without retrying in a loop.
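To make the approval flow concrete, here is a minimal sketch of single-slot pending-action state; the types and method names are illustrative, not the runtime's actual API.

```rust
/// Illustrative sketch of the single-slot approval flow; names are assumptions.
struct PendingAction {
    tool: String,
    description: String, // proposed mutation shown to the user
}

#[derive(Default)]
struct Runtime {
    pending: Option<PendingAction>,
}

enum Outcome {
    Applied(String),
    /// Runtime-owned terminal answer: no extra model round to "summarize" a rejection.
    Cancelled(String),
    NothingPending,
}

impl Runtime {
    /// A mutating tool parks its proposal here instead of executing.
    fn propose(&mut self, action: PendingAction) {
        self.pending = Some(action);
    }

    fn approve(&mut self) -> Outcome {
        match self.pending.take() {
            Some(a) => Outcome::Applied(format!("applied {}: {}", a.tool, a.description)),
            None => Outcome::NothingPending,
        }
    }

    fn reject(&mut self) -> Outcome {
        match self.pending.take() {
            Some(a) => Outcome::Cancelled(format!("cancelled {}", a.tool)),
            None => Outcome::NothingPending,
        }
    }
}

fn main() {
    let mut rt = Runtime::default();
    rt.propose(PendingAction {
        tool: "edit_file".into(),
        description: "replace 3 lines in src/main.rs".into(),
    });
    if let Outcome::Cancelled(msg) = rt.reject() {
        println!("{msg}"); // terminal, runtime-owned answer
    }
}
```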
`search_code` is a literal substring search. The runtime simplifies model-generated search phrases into a single literal keyword and enforces a per-turn budget: one search is allowed, a second search is allowed only when the first returned no matches, and later search attempts are blocked with a correction so the model must answer cleanly.
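A minimal sketch of that budget policy; the `SearchBudget` struct and its names are assumptions, but the rules match the text: one search, one retry only after an empty result, then a blocking correction.

```rust
/// Illustrative per-turn search_code budget; names are hypothetical.
#[derive(Default)]
struct SearchBudget {
    attempts: u32,
    last_had_matches: bool,
}

enum SearchDecision {
    Allow,
    /// Blocked with a runtime-owned correction so the model must answer.
    Block(&'static str),
}

impl SearchBudget {
    fn check(&self) -> SearchDecision {
        match self.attempts {
            0 => SearchDecision::Allow,
            1 if !self.last_had_matches => SearchDecision::Allow, // retry only after no matches
            _ => SearchDecision::Block("search budget exhausted; answer from existing evidence"),
        }
    }

    fn record(&mut self, match_count: usize) {
        self.attempts += 1;
        self.last_had_matches = match_count > 0;
    }
}

fn main() {
    let mut budget = SearchBudget::default();
    assert!(matches!(budget.check(), SearchDecision::Allow));
    budget.record(0); // first search found nothing
    assert!(matches!(budget.check(), SearchDecision::Allow)); // one retry permitted
    budget.record(5);
    assert!(matches!(budget.check(), SearchDecision::Block(_)));
    println!("budget policy holds");
}
```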
The runtime enforces a structured investigation loop rather than relying on the model to behave correctly on its own.
At a high level:
- search → read → answer gating is enforced per turn
- evidence must be established before synthesis is allowed
- tool usage is restricted by per-turn tool surfaces
- after evidence is accepted, further tool calls are blocked
- repeated violations result in runtime-owned terminal outcomes
This allows the system to remain correct and predictable even when the model makes mistakes or attempts invalid actions.
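As a sketch, per-turn gating can be modeled as a small state machine; the `Phase` enum and the tool-to-phase mapping below are illustrative assumptions, not the runtime's actual types.

```rust
/// Illustrative per-turn investigation phases; the real runtime types may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Search, // assumed: only search_code is on the tool surface
    Read,   // assumed: only read_file / list_dir are allowed
    Answer, // evidence accepted; all further tool calls are blocked
}

/// Returns whether `tool` is permitted in the current phase.
fn tool_allowed(phase: Phase, tool: &str) -> bool {
    match phase {
        Phase::Search => tool == "search_code",
        Phase::Read => matches!(tool, "read_file" | "list_dir"),
        Phase::Answer => false, // synthesis only; violations get a correction
    }
}

fn main() {
    assert!(tool_allowed(Phase::Search, "search_code"));
    assert!(!tool_allowed(Phase::Search, "edit_file"));
    assert!(tool_allowed(Phase::Read, "read_file"));
    assert!(!tool_allowed(Phase::Answer, "search_code"));
    println!("phase gating holds");
}
```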
The codebase is split into six main layers:
- `src/app/` — startup, config, paths, session orchestration
- `src/runtime/` — conversation loop, tool parsing, approval state, runtime events
- `src/tools/` — tool contracts, registry, and implementations
- `src/storage/` — SQLite session storage
- `src/llm/` — backend abstraction and providers
- `src/tui/` — terminal input, rendering, and slash commands
Key architectural rules reflected in the code:
- parsing of raw tool syntax lives in `runtime/tool_codec.rs`
- tools operate on typed `ToolInput`/`ToolOutput`, not raw model text
- mutating tools separate `run()` from `execute_approved()` (sketched below)
- the runtime does not depend on the TUI or SQLite directly
- the TUI renders events but does not execute tools
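A minimal sketch of a tool contract consistent with these rules; the trait shape and type names are assumptions, not the actual `src/tools/` API.

```rust
/// Illustrative tool contract; the real src/tools/ types may differ.
use std::collections::BTreeMap;

/// Typed input: validated arguments, never raw model text.
struct ToolInput {
    args: BTreeMap<String, String>,
}

enum ToolOutput {
    /// Read-only result, injected back into the conversation immediately.
    Result(String),
    /// Proposed mutation; nothing is written until execute_approved().
    NeedsApproval { description: String },
}

trait Tool {
    fn name(&self) -> &'static str;
    /// First pass: read-only tools return Result; mutating tools return NeedsApproval.
    fn run(&self, input: &ToolInput) -> ToolOutput;
    /// Second pass, called only after the user runs /approve.
    fn execute_approved(&self, input: &ToolInput) -> ToolOutput;
}

struct WriteFile;

impl Tool for WriteFile {
    fn name(&self) -> &'static str { "write_file" }
    fn run(&self, input: &ToolInput) -> ToolOutput {
        let path = input.args.get("path").cloned().unwrap_or_default();
        ToolOutput::NeedsApproval { description: format!("write {path}") }
    }
    fn execute_approved(&self, _input: &ToolInput) -> ToolOutput {
        ToolOutput::Result("file written".into())
    }
}

fn main() {
    let mut args = BTreeMap::new();
    args.insert("path".into(), "src/main.rs".into());
    if let ToolOutput::NeedsApproval { description } = WriteFile.run(&ToolInput { args }) {
        println!("pending: {description}"); // waits for /approve
    }
}
```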
- No shell, git, web, or external integration tools yet.
- No LSP integration or advanced memory system.
- No token-aware live context budgeting before generation.
- Pending approvals are not persisted across restarts.
- Restored session history is loaded into the runtime, but not replayed into the visible TUI transcript.
- Tool UI is compact and text-based; there is no diff view or expandable preview UI yet.
- Performance is currently dominated by repeated model rounds and prompt prefill.
- No bounded answer synthesis yet after evidence is ready (planned).
- No prompt caching or context compression yet.
Build and install to PATH:
```sh
cargo build --release
cargo install --path .
```

Once installed, run from any project directory:

```sh
cd /your/project
thunk
```

thunk walks upward from the current directory to find `config.toml` and `.git`. Copy `config.toml.example` to your project root and edit `model_path` to point to your local `.gguf` model.
Requirements:
- Rust stable
- Interactive terminal (`stdout` must be a TTY and `TERM` must not be `dumb`)
- A local `.gguf` model if using `llama_cpp`
Run during development:
```sh
cargo run
```

Run tests:

```sh
cargo test
```

Configuration lives in `config.toml`. See `config.toml.example` for all available options.
- `llm.provider = "mock"` uses the built-in mock backend.
- `llm.provider = "llama_cpp"` uses the local llama.cpp backend.
- `llama_cpp.model_path` points to the local `.gguf` file to load.
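As an illustration, a minimal `config.toml` for the llama.cpp backend might look like the following; verify the exact key layout against `config.toml.example`:

```toml
[llm]
provider = "llama_cpp"   # or "mock" for the built-in mock backend

[llama_cpp]
model_path = "/path/to/model.gguf"
```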
| Section | Description |
|---|---|
| Architecture | Code-accurate system architecture and runtime behavior |
| Runtime | Focused overview of the runtime loop, events, and approval flow |
| Tools | Current tool contract, registry model, and built-in tool behavior |
| Sessions | Session storage, restore behavior, and persistence limits |
| Setup | Requirements, run/test commands, and config basics |
| Benchmarks | Performance notes and measurements |