feat(agent): agent modes v1a (Fast / Normal / Deep Research) + hybrid vault_search #1

Merged
hegdeatri merged 13 commits from feature/agent-modes-v1a into master 2026-05-18 16:29:12 +01:00
Owner

Three modes for the agent now: Fast, Normal, Deep Research. Each picks a tool profile and a round budget rather than the one-shot workflow we had. New hybrid vault_search tool does FTS + semantic in parallel, fuses with RRF, dedups by note, reranks on title/tag/recency. Title generation moved off the full agent path onto a separate POST /title so a typical first turn stops paying for tools that wouldn't have fired anyway.

Spec at docs/superpowers/specs/2026-05-17-agent-modes-design.md. v1b (router, fast path, telemetry, streaming Deep Research progress) is a separate spec — intentionally not in this PR.

Budgets

Mode Rounds Notes
Fast 1 "What's in note X" style turns.
Normal 3 Default.
Deep Research 6 Broader catalog, iterative.

When the Obsidian CLI isn't configured, Deep Research silently falls back to the Analysis profile — obsidian_* tools just don't appear in the catalog, agent loop doesn't have to know.

UI

  • Composer: chip between Model and Tools, three-row popover with the active row checked, "Default: X" hint when the thread overrides the default.
  • Settings → Indexer → Status: new default-mode picker next to the existing default-tool-backend one. They should probably move into a dedicated "Chat defaults" section eventually — flagged in TODO.

Backward compat

ChatRequest.agent_mode, AppSettings.default_agent_mode, and ChatSession.agentMode all default to Normal when missing. Existing settings and threads keep working.

Checks

  • bun run lint, bun run build — clean
  • cargo check --all-targets, cargo clippy --all-targets — clean (three pre-existing rust-1.94 transmute warnings in db/store.rs, not from this branch)
  • cargo test --lib — 62/62 passing, including mode_budgets_match_table, profile_for_*, vault_search_*, and the two _deserializes_* cases
Three modes for the agent now: Fast, Normal, Deep Research. Each picks a tool profile and a round budget rather than the one-shot workflow we had. New hybrid `vault_search` tool does FTS + semantic in parallel, fuses with RRF, dedups by note, reranks on title/tag/recency. Title generation moved off the full agent path onto a separate `POST /title` so a typical first turn stops paying for tools that wouldn't have fired anyway. Spec at `docs/superpowers/specs/2026-05-17-agent-modes-design.md`. v1b (router, fast path, telemetry, streaming Deep Research progress) is a separate spec — intentionally not in this PR. ### Budgets | Mode | Rounds | Notes | |---|---|---| | Fast | 1 | "What's in note X" style turns. | | Normal | 3 | Default. | | Deep Research | 6 | Broader catalog, iterative. | When the Obsidian CLI isn't configured, Deep Research silently falls back to the Analysis profile — `obsidian_*` tools just don't appear in the catalog, agent loop doesn't have to know. ### UI - Composer: chip between Model and Tools, three-row popover with the active row checked, "Default: X" hint when the thread overrides the default. - Settings → Indexer → Status: new default-mode picker next to the existing default-tool-backend one. They should probably move into a dedicated "Chat defaults" section eventually — flagged in TODO. ### Backward compat `ChatRequest.agent_mode`, `AppSettings.default_agent_mode`, and `ChatSession.agentMode` all default to Normal when missing. Existing settings and threads keep working. ### Checks - `bun run lint`, `bun run build` — clean - `cargo check --all-targets`, `cargo clippy --all-targets` — clean (three pre-existing rust-1.94 transmute warnings in `db/store.rs`, not from this branch) - `cargo test --lib` — 62/62 passing, including `mode_budgets_match_table`, `profile_for_*`, `vault_search_*`, and the two `_deserializes_*` cases
7-step pipeline: FTS5 + semantic → RRF fuse (k=60) → max-dedup by
note → heuristic rerank (title/tag boost, recency decay) → cap →
snippet. Pure-function phases unit-tested with fixtures; DB query
helpers exercised via manual smoke until integration tests land.

Pipeline structs and helpers keep `#[allow(dead_code)]` until Task 5
registers VaultSearchTool with the agent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single register_tools(builder, profile, ctx, budgets) function will
replace the three near-identical .tool() chains in providers.rs
(one per ProviderKind arm — Ollama, OpenRouter, LmStudio|Custom).
tool_names_for is the introspection helper test code asserts against,
kept in lockstep with register_tools by code review.

Adjusted from plan: Rig 0.33's AgentBuilder has a NoToolConfig →
WithBuilderTools type-state. The None profile transitions via
.tools(Vec::new()) so all branches return WithBuilderTools.

`#[allow(dead_code)]` on ToolProfile/profile_for/tool_names_for/
register_tools stays until Task 7 wires providers.rs — the plan's
strip-allows step was premature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
build_system_prompt now takes (conn, note_count, AgentMode, ToolProfile,
obsidian_configured, schema). The catalog is built from a static
CatalogRow slice filtered by tool_names_for(profile, obsidian_configured),
so it matches register_tools exactly — no risk of advertising tools the
agent doesn't have. Strategy block keyed on AgentMode (Fast/Normal/
DeepResearch). Stable-prefix-first structure preserved; new
strategy_block_sits_in_stable_prefix test pins this.

obsidian_configured is passed explicitly (not derived from profile) to
avoid a catalog-lies-to-the-model bug if ObsidianLive ever degrades to
Analysis at registration time. Reviewed pre-commit by code-reviewer
agent — see /tmp/task-6-7-adapted-plan.md for the deltas.

NOTE: providers.rs does not compile after this commit; Task 7 (next
commit) rewires the call site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the three near-identical .tool() chains (one per ProviderKind
arm) with a single register_tools(builder, profile, &ctx, budgets) call
in each has_vault branch — ~75 LOC of duplication collapse to ~3 per
arm. create_chat_stream takes AgentMode and resolves profile, budgets,
and ToolContext once up front; effective_max_rounds prefers an explicit
caller override over the mode budget.

obsidian_configured is computed from vault_path (settings-only, no DB
read) and threaded into both profile_for and build_system_prompt so
the system prompt's catalog cannot lie about which tools are
registered. has_vault still gates actual registration on note_count.

The legacy tool_backend parameter stays on the signature for backward
compatibility (frontend still sends it); v1b removes it. The /chat
route handler hard-codes AgentMode::Normal — Task 8 wires the
per-request resolution from ChatRequest.

Cleanup:
- Strip Task 2/5 #[allow(dead_code)] from ToolProfile, profile_for,
  tool_names_for, register_tools — they have a real caller now.
- Annotate ToolProfile::None with #[allow(dead_code)] until Task 9
  /title endpoint lights it up.
- #![allow(dead_code)] on agent/tools/obsidian/{search,read,tags}.rs;
  vault_search replaces obsidian_search and read_note/list_tags cover
  the others. v1b decides whether to delete the modules outright.
- #[allow(clippy::too_many_arguments)] on create_chat_stream; future
  refactor can bundle args into a struct.

Defers generate_once helper to Task 9 (its first real consumer).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
clippy::manual_repeat_n flagged the `repeat("?").take(n)` pattern in
vault_search::query_note_meta after the MSRV bump made repeat_n
available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Optional, serde-default to None. Handler resolves:
  body.agent_mode.unwrap_or(settings.default_agent_mode)
Matches the ToolBackend / max_tool_rounds per-request precedent.
Frontend payloads that don't send the field continue to work — they
fall back to AppSettings.default_agent_mode (defaults to Normal).

Adds two pure-serde unit tests on ChatRequest covering the missing
field and the "fast" / "deep_research" wire shapes — locks in the
contract that Task 10's chat-client.ts will mirror.

AGENTS.md /chat docs gain a parallel sentence so the agent_mode
override surface stays discoverable.
generate_once builds a no-tools agent per provider arm, streams it,
accumulates TextDelta chunks into a String. Thinking / reasoning
deltas dropped so reasoning-prefix models (DeepSeek-R1, Qwen3-thinking)
return clean output. Streaming errors surface as
AppError::ProviderError.

POST /title takes {message, history?}, returns {title}. Handler
guards empty/whitespace input by returning {title: ""} instead of
hitting providers (Rig rejects empty user messages). clean_title
strips surrounding + embedded quotes (titles shouldn't quote
themselves; embedded apostrophes lost — acceptable for short titles,
documented), trims terminal .!?, collapses whitespace.

Deviation from spec: title prompt is passed as system preamble +
real user message rather than spec's "everything in preamble, empty
user message" — matches the create_chat_stream pattern in this file
and avoids the empty-message risk the spec itself flagged.

Tests: clean_title rules (incl. empty-string + adversarial
all-punctuation cases), TitleRequest serde shape (with + without
history), TitleResponse wire shape. AGENTS.md gains /title route entry.

Frontend swap to this endpoint lands in Task 11.
- AgentMode = "fast" | "normal" | "deep_research" (mirrors Rust enum).
- ChatHistoryMessage extracted from streamChat's inline shape into an
  exported interface so TitleRequest / ChatRequest / useChat can share.
- ChatRequest exported as documentation-only wire-format type
  (streamChat keeps its positional signature for callsite stability
  but builds the body inline using the same fields).
- ChatStreamOptions gains agentMode; streamChat sets body.agent_mode
  when defined and otherwise lets the backend fall back to
  settings.default_agent_mode (matches the existing maxToolRounds
  ?-pattern).
- requestTitle(message, history?) wraps POST /title. JSDoc documents
  the contract: returns "" (not an error) on blank input; throws on
  non-2xx — caller falls back as appropriate.
- AppSettings gains optional default_agent_mode, mirroring the
  ?-optional default_tool_backend precedent for legacy-JSON
  tolerance. Task 11/13 callers fall back via `?? "normal"`.

Legacy generateTitle untouched — Task 11 swaps useChat over to
requestTitle, after which generateTitle can be removed.
- ChatSession gains optional agentMode (mirrors existing toolBackend?).
  Inline literal type so mock-data.ts stays free of chat-client.ts
  imports — keeps the dependency direction one-way.
- draftAgentMode / setSessionAgentMode parallel the toolBackend story:
  hold the user's choice in hook-local state for the draft session,
  promote to the new session on first send.
- Mode resolution: session.agentMode ?? settings.default_agent_mode
  ?? "normal". Threaded through both sendMessage and retryMessage
  into beginStream, then into streamChat's options bag (6th arg).
- Title path switched from legacy generateTitle (which went through
  /chat) to requestTitle (POST /title — no tools, no SSE). Fired
  fire-and-forget on first-message complete, same timing as before.

The empty-title behavior is intentionally improved: legacy generateTitle
would unconditionally overwrite the slug title even with "" (rare LLM
edge), blanking the thread. The new path keeps the slug fallback
("text.slice(0, 50)") when requestTitle returns the documented
empty-string-on-blank-input. Visible-only on the unlikely path; no
user-facing regression.

generateTitle is now unused inside useChat. Left in chat-client.ts as
dead code for v1b housekeeping — pruning it touches the import graph
and is best done after Task 14 manual smoke confirms /title parity.
The Composer (Task 12) and Settings (Task 13) will consume the new
draftAgentMode / setSessionAgentMode / default_agent_mode surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Composer gains agentMode / defaultAgentMode / onAgentMode props,
  mirroring the existing toolBackend triplet. New chip sits between
  Model and Tools, displays a half-circle icon + short label
  (Fast / Normal / Deep) + caret. Chip lights up "active" when the
  thread overrides the user's configured default.
- ModePopover renders three radio rows (Fast, Normal, Deep Research)
  with one-line descriptions and per-mode icons (Lightning, CircleHalf,
  Binoculars from Phosphor). Reuses the existing unused .mpop-tool-row
  CSS family so no new styles are introduced. Checkmark on the active
  row; "Default: X" footer hint when the thread is on a non-default
  mode. Globals.css gains a one-line comment marking ModePopover as
  the new owner of those classes.
- page.tsx wires draftAgentMode / setSessionAgentMode from useChat
  and resolves the effective mode the same way it resolves
  toolBackend: (isDraft ? draftAgentMode : activeSession.agentMode)
  ?? settings.default_agent_mode ?? "normal".
- The model and tools chip onClicks gain a setShowModes(false) reset
  so the three chips form a proper "only one popover open" peer
  group. Inline comment notes that this peer-reset (not the
  popover's outside-click guard) is the load-bearing close mechanism
  for inter-chip clicks.

No backend touched. Settings default-mode picker (Task 13) and
manual smoke (Task 14) follow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surfaces settings.default_agent_mode in the UI so newly-created chat
threads can inherit a non-Normal default. Mirrors the existing
default-tool-backend picker pattern: three-button .seg radiogroup with
aria-checked + immediate updateSettings() persistence.

Per-button hint copy ("up to N tool rounds") matches the
AgentMode::budgets() constants in src-tauri/src/config.rs
(Fast=1, Normal=3, DeepResearch=6) — quantitative register parallels
the tool-backend's "~600 ms / call" / "<30 ms / call" annotations.

Placement: src/components/pkma/sections/indexer.tsx::StatusPanel,
adjacent to the default-tool-backend picker. Spec text says "or the
settings section that owns the default-tool-backend picker"; that's
StatusPanel, not models.tsx (which uses a draft/save form pattern
that would be heavier to wire for two booleans).

Section hint updated from "Progress, reindexing, and default tool
backend" to "Progress, reindexing, and chat defaults for new threads"
so the section header no longer lies about its contents.

TODO.md gets a v1b polish note: both chat-default pickers should
ultimately move into a dedicated "Chat defaults" section — agent
mode has nothing to do with indexing, and the current home will
surprise the next refactor.

bun run lint clean. bun run build clean.

12 of 14 v1a tasks shipped (Task 13). Remaining: Task 14 manual smoke
+ plan sign-off.
Task 14 — automated end-to-end audit clean:
  - bun run lint: clean
  - bun run build: clean (Next 16 static export, 3 static pages)
  - cargo check --all-targets: clean
  - cargo clippy --all-targets: clean (3 pre-existing warnings in
    src/db/store.rs from rust-1.94's clippy::missing_transmute_-
    annotations; not introduced by this branch)
  - cargo test --lib: 62 passed, 0 failed, 3 ignored, including
    mode_budgets_match_table, profile_for_*, vault_search_*,
    chat_request_deserializes_without_agent_mode,
    app_settings_without_default_agent_mode_deserializes_to_normal

End-to-end agent_mode trace verified (no orphan hardcoded Normal):
  composer.tsx chip
    → useChat.tsx (agentMode resolution before send)
    → chat-client.ts streamChat options.agentMode → body.agent_mode
    → server/routes.rs ChatRequest.agent_mode (#[serde(default)])
      → mode = body.agent_mode.unwrap_or(settings.default_agent_mode)
    → providers.rs create_chat_stream(mode, ...)
      → register_tools(builder, profile_for(mode, obsidian), ctx, budgets)
      → mode.budgets().max_tool_rounds as loop cap (overridable per
        request via max_tool_rounds_override)

/title route wired: server/mod.rs `.route("/title", post(routes::title))`
→ routes::title → providers::generate_once (no tools, no SSE).
useChat.tsx:601 calls requestTitle on first message.

Drops dead generateTitle from chat-client.ts (zero consumers after
Task 11 migrated useChat to requestTitle — the SSE-based title path
was the v1a deliverable being replaced, not v1b housekeeping).

Spec status flipped to "v1a shipped 2026-05-18 (awaiting merge)".
TODO.md Tier 2 v1a checkbox flipped; commit roster updated; v1b
polish items recorded (Chat-defaults section extract; rust-1.94
transmute annotations on sqlite3_auto_extension call sites).

Manual smoke (live Tauri window, real vault, real provider) is the
user's call to drive. Reported separately when run.

14 of 14 v1a tasks shipped.
hegdeatri merged commit 38e8aba6c0 into master 2026-05-18 16:29:12 +01:00
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
hegdeatri/pkma-rs!1
No description provided.