RFD 009: Stateful Tool Protocol

Status: Discussion
Category: Design
Authors: Jean Mertz git@jeanmertz.com
Date: 2026-02-23
Required by: RFD 010, RFD 011, RFD 037, RFD 058

Summary

This RFD introduces a stateful tool execution protocol that unifies one-shot and long-running tools under a single execution model. Every tool call is internally modeled as a state machine (Running → Stopped), with one-shot tools wrapped transparently. Long-running tools expose this lifecycle to the assistant, enabling multi-step interactive workflows like git add --patch or persistent shell sessions.

Motivation

JP's current tool execution model is synchronous: the assistant calls a tool, JP runs a command, captures the output, and returns it. This works for tools like cargo_check or grep, but breaks down for programs that:

Run indefinitely (shells, file watchers, dev servers)
Require multiple rounds of input (interactive git, debuggers)
Produce output over time (build processes, test suites)
Need to be inspected or stopped mid-execution

Today, if a tool takes too long or needs input, it hangs. There is no way for the assistant to check on a running tool, send it input, or stop it. The only escape hatch is the inquiry system, which re-executes the tool from scratch with accumulated answers — a pattern that doesn't work for tools with genuine long-running state.

We want a model where:

Any tool can run asynchronously without blocking the conversation.
The assistant can spawn a tool, check its state, send input, and stop it.
One-shot tools work exactly as they do today — no user-visible change.
The same protocol supports both simple commands and interactive programs.

Concrete example

Consider a git tool that supports interactive staging:

txt

assistant: call(git, { action: "spawn", id: "staging", command: "stage", args: ["--patch"] })
  → { "id": "staging", "state": "running", "content": "diff --git a/..." }

assistant: call(git, { action: "fetch", id: "staging" })
  → { "id": "staging", "state": "running", "content": "Stage hunk? [y/n/...]" }

assistant: call(git, { action: "apply", id: "staging", input: "y" })
  → { "id": "staging", "state": "running", "content": "Next hunk: ..." }

assistant: call(git, { action: "fetch", id: "staging" })
  → { "id": "staging", "state": "stopped", "result": "Staged 3 hunks." }

Today this is impossible. The git tool can only run non-interactive commands. With the stateful protocol, the tool author builds a git tool that uses JP's infrastructure to manage the interactive subprocess, and the assistant drives the workflow through the standard tool call interface.

Toward sub-agents

The stateful tool protocol is also a stepping stone toward sub-agent capabilities in JP, where the main assistant spawns sub-assistants to perform tasks concurrently. A sub-agent is, from the protocol’s perspective, just another stateful tool: it is spawned, produces output over time, accepts input, and eventually stops. The handle registry, action-based schema, and assistant-driven polling model all apply directly. This RFD does not propose sub-agents, but the infrastructure it introduces is designed to support them.

Design

`ToolState` — what a tool produces

ToolState replaces the current Outcome enum. It represents the state of a tool at any point in its lifecycle:

rust

#[derive(Debug, PartialEq, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ToolState {
    /// The tool is running. Contains optional output produced so far,
    /// including any initial output from startup.
    Running { content: Option<String> },

    /// The tool is paused, waiting for structured input.
    ///
    /// This maps to the existing inquiry system. `Apply` delivers
    /// the answer to the pending question.
    Waiting { content: Option<String>, question: Question },

    /// The tool has stopped (successfully or with an error).
    Stopped { result: Result<String, ToolError> },
}

#[derive(Debug, PartialEq, Serialize, Deserialize)]
pub struct ToolError {
    pub message: String,
    pub trace: Vec<String>,
    pub transient: bool,
}

All states carry an optional content: String field. The tool author controls what goes in this string — they might format it as plain text, JSON, XML, or combine stdout/stderr however they see fit.

Running includes content from the first response onward, including any initial output from startup (a shell prompt, an editor screen, a welcome message). This avoids an extra round-trip to get the initial content.

Stopped returns either a successful output as a string, or an error message containing an optional trace and a boolean indicating whether the error is transient (e.g. a network error), and can thus be retried.

Why `content` is a string, not a generic or `Value`

Structured data that JP needs to act on (a Question in Waiting, a ToolError in Stopped) is carried in dedicated typed fields on each variant, not inside content. The content field is opaque output intended for the assistant — JP passes it through without parsing it.

This means future variants can carry additional typed fields without changing content's type. For example, a sub-agent tool that surfaces an inquiry would return Waiting { question, .. } with the Question in the typed field. The sub-agent serializes its internal state as JSON over the wire; JP deserializes it into the typed ToolState variants on the other side — the same pattern that local tools already use when they write Outcome JSON to stdout.

Making content into serde_json::Value or making ToolState generic over T: Serialize was considered but rejected: Value forces every consumer to parse, and generics infect every containing type with a type parameter. In practice, content is almost always a string. If a richer wire format is needed for sub-agents, that can be addressed in the sub-agent RFD by evolving the variants, or by changing content's type.

State transitions

txt

Running → Stopped
   ↓
 Waiting → Running (after Apply)
   ↓
 Stopped (if aborted while waiting)

Running is the initial state for a stateful tool. It means the tool is active and may contain output. Waiting means the tool needs structured input (a Question) before it can continue — this maps to the existing inquiry system. Stopped is terminal — the tool has exited, either successfully (Ok(content)) or with an error (Err(...)).

`ToolCommand` — what JP dispatches internally

When the assistant calls a tool, JP parses the arguments and produces a ToolCommand:

rust

pub enum ToolCommand {
    /// Spawn a new tool process with an assistant-chosen handle ID.
    Spawn { id: String, args: Map<String, Value> },

    /// Read the current state of a running tool.
    Fetch { id: String },

    /// Send input to a running or waiting tool.
    Apply { id: String, input: Value },

    /// Terminate a running tool.
    Abort { id: String },
}

For one-shot tools (no action field in arguments), JP synthesizes a Spawn followed by a blocking Fetch loop until the tool reaches Stopped. The assistant never sees intermediate states.

For stateful tools (the assistant includes an action field), JP dispatches the command directly and returns the resulting ToolState to the assistant.

`Apply` serves two purposes

Apply { input } is used for both:

Sending data to a Running tool. The input value (typically a string) is written to the tool's stdin or input mechanism. For example, sending "y\n" to a git add --patch session.
Answering a Question from a Waiting tool. The input value (a boolean, string, or other typed value) is delivered as the answer to the pending question. This is the same data that the inquiry system or user prompt would produce.

The tool handle's internal state determines how Apply is routed:

Running handle: input is written to stdin. No question correlation needed.
Waiting handle: JP extracts the question.id from the handle's Waiting { question, .. } state and maps the answer: accumulated_answers[question.id] = input. This is unambiguous because a handle has at most one pending question — a tool can only be in one state at a time.

The caller (assistant or JP's internal one-shot wrapper) does not need to include a question ID in the Apply command. JP derives it from the handle's current state, preserving the same correlation the existing inquiry system provides through explicit InquiryId matching.

For one-shot shell tools in the Waiting state, Apply triggers a re-execution with accumulated answers (preserving the existing NeedsInput behavior). For stateful tools, Apply delivers the input to the still-running process.

Per-tool action sets

Not every stateful tool supports every action. A tool declares which actions it supports, and only those appear in the schema exposed to the assistant:

Tool	Actions	Why
`git` (interactive)	`spawn`, `fetch`, `apply`, `abort`	Needs input for interactive staging
`cargo_check` (async)	`spawn`, `fetch`, `abort`	No stdin, just poll for completion
Background task	`spawn`, `fetch`	Fire and forget, can’t be stopped

The assistant cannot call an action that the tool doesn’t expose — it’s not in the schema. This avoids the question of what happens when apply is sent to a tool that doesn’t accept input: that case simply cannot arise.

Mapping from current `Outcome`

Current `Outcome`	New `ToolState`
`Success { content }`	`Stopped { result: Ok(_) }`
`Error { message, trace, transient }`	`Stopped { Result: Err(_) }`
`NeedsInput { question }`	`Waiting { content: "", question }`

How the assistant interacts with stateful tools

A stateful tool exposes action and id in its JSON schema, alongside its own tool-specific parameters. The schema uses oneOf so each action variant has its own sub-schema. Only the actions the tool supports are included.

Example schema for git (supports spawn, fetch, apply, abort):

json

{
  "type": "object",
  "oneOf": [
    {
      "properties": {
        "action": {
          "const": "spawn"
        },
        "id": {
          "type": "string",
          "description": "Handle ID for this tool instance. Must be unique across active handles."
        },
        "command": {
          "enum": [
            "stage",
            "commit",
            "rebase"
          ]
        },
        "args": {
          "type": "array",
          "items": {
            "type": "string"
          }
        }
      },
      "required": [
        "action",
        "id",
        "command"
      ]
    },
    {
      "properties": {
        "action": {
          "const": "fetch"
        },
        "id": {
          "type": "string"
        }
      },
      "required": [
        "action",
        "id"
      ]
    },
    {
      "properties": {
        "action": {
          "const": "apply"
        },
        "id": {
          "type": "string"
        },
        "input": {}
      },
      "required": [
        "action",
        "id",
        "input"
      ]
    },
    {
      "properties": {
        "action": {
          "const": "abort"
        },
        "id": {
          "type": "string"
        }
      },
      "required": [
        "action",
        "id"
      ]
    }
  ]
}

Example schema for cargo_check (supports spawn, fetch, abort — no apply because it doesn’t accept input):

json

{
  "type": "object",
  "oneOf": [
    {
      "properties": {
        "action": {
          "const": "spawn"
        },
        "id": {
          "type": "string",
          "description": "Handle ID for this tool instance. Must be unique across active handles."
        },
        "package": {
          "type": "string"
        }
      },
      "required": [
        "action",
        "id"
      ]
    },
    {
      "properties": {
        "action": {
          "const": "fetch"
        },
        "id": {
          "type": "string"
        }
      },
      "required": [
        "action",
        "id"
      ]
    },
    {
      "properties": {
        "action": {
          "const": "abort"
        },
        "id": {
          "type": "string"
        }
      },
      "required": [
        "action",
        "id"
      ]
    }
  ]
}

The tool author defines the spawn variant’s parameters and which actions are supported. The fetch, apply, and abort variants follow a fixed pattern. JP’s SDK generates the full schema from the tool’s definition for tools written in Rust. Other languages may need to generate the schema manually.

Handle registry

JP maintains a handle registry that maps handle IDs to running tool instances. The registry lives in the query execution loop and persists across tool call rounds within a turn.

rust

struct HandleRegistry {
    handles: HashMap<String, ToolHandle>,
}

struct ToolHandle {
    tool_name: String,
    state: ToolState,
    // Opaque handle to the running tool — the tool implementation
    // decides what this contains (a child process, a PTY session,
    // an async task, etc.)
    inner: Box<dyn AnyToolHandle>,
}

Handle IDs are chosen by the assistant via a required id parameter on the spawn action. The assistant picks descriptive names (check, staging, build_release) that appear in the conversation history and should be readable. JP validates uniqueness — if the chosen ID collides with an already-active handle, the spawn returns an error.

Model-chosen IDs are preferable to JP-generated sequential IDs (h_1, h_2) because they are meaningful in conversation history and the assistant knows them at call time without waiting for JP's response.

When the assistant calls tool(action: "fetch", id: "staging"), JP looks up staging in the registry, calls the tool's fetch handler, and returns the result.

Handle lifecycle

Handles are created when a tool returns a non-terminal state and destroyed when:

The tool reaches Stopped (natural completion)
The assistant or user sends Abort
The conversation turn ends (all remaining handles are aborted)
TIP
RFD 037 extends turn-end behavior with configurable per-tool policies (inquire, await, abort).
JP exits (child processes receive SIGTERM, then SIGKILL)

One-shot tool wrapping

For tools that don't declare stateful support, JP wraps the existing execution in the stateful protocol:

txt

assistant: call(cargo_check, { package: "jp_cli" })

JP internally:
  1. No `action` in args → treat as one-shot
  2. Spawn tool → internally creates handle
  3. Loop: fetch handle state
     - Running → wait, continue
     - Waiting → handle via inquiry/prompt (existing NeedsInput flow)
     - Stopped → return result to assistant
  4. Destroy handle
  5. Return ToolCallResponse to assistant as before

assistant gets: ToolCallResponse { result: "ok, 0 warnings" }

The assistant sees no difference. The StreamEventHandler.handle_tool_call function wraps the existing call in this loop. The NeedsInput -> re-execute pattern maps to: tool returns Waiting, JP collects the answer (prompt or inquiry), sends Apply, tool continues.

For shell-based tools that exit immediately, the internal flow is: spawn process → process exits → parse stdout as Outcome → convert to ToolState::Stopped → return. The handle exists for microseconds.

Detecting stateful vs. one-shot tools

A tool is stateful if the assistant invokes it using the stateful protocol — that is, with an action field in its arguments. Detection is based on the invocation, not the return value.

action field present → stateful. JP dispatches the ToolCommand directly. If the action is spawn, JP registers a handle for the tool. Subsequent fetch/apply/abort calls reference the handle by ID.
No action field → one-shot. JP wraps the call in the internal spawn/fetch loop. If the tool returns Waiting (e.g., fs_modify_file asking about a dirty file), JP handles it via the existing inquiry/prompt path and re-executes with accumulated answers. The tool is never registered as stateful.

This distinction matters because a one-shot shell tool that returns Waiting has already exited — there is no running process to send Apply to. The Waiting state in the one-shot path means "re-execute with answers" (the current NeedsInput behavior), while Waiting in the stateful path means "the process is alive and paused, deliver input via Apply."

Existing tools like fs_modify_file require no changes. They don't declare stateful actions in their schema, so the assistant never sends an action field, and JP runs them through the one-shot path exactly as today.

For the JSON schema to include the action/id fields, the tool's configuration or definition must indicate stateful support. This could be:

A flag in the tool config (stateful = true)
The tool definition including action in its parameters
Built-in tools declaring it programmatically

The exact mechanism for schema declaration is a detail to resolve during implementation.

Integration with existing tool system

The stateful protocol slots into the existing execution flow at the StreamEventHandler.handle_tool_call level. Today this method:

Resolves the tool config and definition
Handles run mode (ask/unattended/edit/skip)
Calls ToolDefinition::call in a loop (retrying on NeedsInput)
Collects ToolCallResponse

With the stateful protocol, step 3 changes:

One-shot tools: Same loop, but internally modeled as spawn → fetch → stopped. The NeedsInput retry loop maps to the Waiting → Apply cycle.
Stateful tools: No loop. JP dispatches the ToolCommand, returns the ToolState to the assistant, and the assistant drives subsequent calls.

The ToolCallResponse sent to the assistant for stateful tools includes the handle ID and state:

rust

ToolCallResponse {
    id: call.id,
    result: Ok(json!({
        "id": "staging",
        "state": "running",
        "content": "Stage this hunk? [y/n/...]"
    }).to_string()),
}

The conversation event stream (ToolCallRequest → ToolCallResponse) is unchanged. Each action call is a separate tool call round from the assistant's perspective.

Interaction with the inquiry system

The current inquiry system handles NeedsInput with QuestionTarget::Assistant by making a structured LLM call to get the answer, then re-executing the tool. Under the stateful protocol, the inquiry system produces the answer the same way, but delivers it via Apply:

Tool returns Waiting { question, .. } (with target: Assistant in config)
JP's inquiry system makes the structured LLM call, extracts the answer (unchanged)
JP sends Apply { input: answer } to the tool handle
For one-shot shell tools: Apply triggers re-execution with accumulated answers (existing behavior — the process has already exited)
For stateful tools: Apply delivers the answer to the still-running process

When the assistant drives a stateful tool directly (no inquiry system):

Tool returns Waiting { question, .. } → JP returns it to the assistant as part of the ToolCallResponse
The assistant calls tool(action: "apply", id: "staging", input: true)
JP sends Apply { input: true } to the handle
The tool answers the question and continues

Both paths use the same Apply command. The tool handle routes the input based on its internal state (stdin write vs. question answer).

`ToolState` in `jp_tool`

The Outcome type in jp_tool is the public API that tool authors use. The migration path:

Add ToolState alongside Outcome in jp_tool
Implement From<Outcome> for ToolState for backward compatibility
Update internal code to use ToolState
Deprecate Outcome (but keep accepting its JSON format from shell tools)

rust

impl From<Outcome> for ToolState {
    fn from(outcome: Outcome) -> Self {
        match outcome {
            Outcome::Success { content } => ToolState::Stopped {
                content,
                error: None,
            },
            Outcome::Error { message, trace, transient } => ToolState::Stopped {
                content: String::new(),
                error: Some(ToolError { message, trace, transient }),
            },
            Outcome::NeedsInput { question } => ToolState::Waiting {
                content: String::new(),
                question,
            },
        }
    }
}

Drawbacks

Complexity. The unified state machine adds a layer of abstraction over what is currently a simple call-and-return model. For the majority of tools that are one-shot, this abstraction provides no user-visible benefit — it's purely internal.

Handle management. Long-lived handles introduce state that must be tracked, cleaned up on errors, and survived across tool call rounds. This is new territory for JP's tool system, which currently has no cross-round state beyond TurnState.pending_tool_call_questions.

Schema complexity. The oneOf schema for stateful tools is more complex than a flat parameter list. Some assistant providers handle oneOf schemas poorly or not at all. This may require provider-specific schema transformations.

Two execution paths. Despite the unified model, one-shot and stateful tools still have different code paths (wrap-and-loop vs. direct dispatch). The abstraction unifies the types, not the implementation.

Alternatives

Keep tools strictly one-shot, add separate "session" tools

This was the original approach in the first draft of this RFD — expose PTY sessions as a set of generic terminal_* tools. The assistant would call terminal_start("git add --patch") to launch any interactive program.

Rejected because: It exposes the wrong abstraction. The assistant should call domain tools (git, editor), not generic terminal tools. The stateful protocol lets each tool define its own commands and parameters while reusing JP's handle management infrastructure.

Make statefulness a tool config property only

Require stateful = true in the tool configuration, rather than inferring it from the tool's return value.

Rejected because: It adds configuration burden. However, some form of declaration is needed for schema generation — the assistant needs to see the action/id parameters in the tool's schema to know it supports the stateful protocol. The rejected approach is making config the only mechanism. In the proposed design, the schema declaration drives both schema generation and runtime dispatch (presence of action in arguments).

Separate `ToolState` from `Outcome` entirely

Don't provide From<Outcome> for ToolState. Make them independent types with no conversion path.

Rejected because: It would break every existing tool immediately. The conversion preserves backward compatibility — existing tools that output Outcome JSON continue to work.

Non-Goals

PTY / terminal emulation. This RFD covers the protocol and handle management. How a tool actually manages an interactive subprocess (PTY, terminal emulator, screen buffer) is covered in a follow-up RFD.
Interactive tool SDK. A convenience SDK (jp_tool::interactive) for building stateful tools in Rust is a follow-up. This RFD defines the protocol that such an SDK would target.
Parallel stateful tools. Multiple handles can coexist, but this RFD does not address coordinating between them (e.g., synchronizing on multiple handles, piping output from one to another).
TIP
RFD 037 introduces an await built-in tool for cross-handle synchronization.
Persistent handles across JP invocations. Handles are scoped to the JP process. When JP exits, all handles are destroyed.

Risks and Open Questions

Schema generation for stateful tools

The oneOf schema pattern for action variants is not universally supported by LLM providers. Google's Gemini supports anyOf but strictly requires it to be the only field in the schema object — no sibling properties are allowed alongside it. Other providers may have their own limitations. We may need to fall back to a flat schema with optional fields and use description to explain the action protocol, or apply per-provider schema transformations. This needs validation with each provider.

Handle cleanup guarantees

If JP crashes (SIGKILL), handles are not cleaned up. Child processes become orphans. This is the same risk as the current subprocess model, but stateful tools are more likely to have long-running processes that leave visible state (lock files, half-written files). We should document this and consider a cleanup-on-startup mechanism (e.g., a .jp/handles.lock file).

Token cost of stateful interactions

Each fetch / apply round is a full LLM tool call round-trip. A 20-step interactive session means 20 tool call rounds, each adding input/output tokens. For tools that produce large output (full terminal screens), this adds up. This RFD does not address token optimization — that's a concern for the tool implementation (e.g., returning diffs instead of full screens).

Interaction with tool permission system

The current RunMode::Ask prompts the user before each tool call. For stateful tools, this would mean prompting before every fetch and apply, which would make interactive workflows unusable. The recommendation is: prompt on spawn, run fetch/apply/abort unattended. But this should be configurable per action, not just per tool.

Proactive delivery of stopped handles

When a stateful tool reaches Stopped without the assistant having asked for its state (e.g., cargo check finishes while the assistant is doing other work), how does the assistant learn about it?

The current event model requires ToolCallRequest → ToolCallResponse pairs. JP cannot inject a response without a corresponding request. Options considered:

Require the assistant to poll. Simple but fragile — the assistant must remember to fetch every handle it spawned.
Fabricate a ToolCallRequest/ToolCallResponse pair. Violates the event model contract.
Deliver at turn end. The turn is already over — the assistant has responded. The result would only be available in the next turn.

None of these are clean. The recommended approach for this RFD is assistant-driven polling: the assistant is responsible for checking on its handles. JP does not proactively push results.

A better solution may be a general-purpose system message queue that delivers notifications piggybacked on existing messages. This would allow JP to inform the assistant of handle state changes without fabricating events. Such a mechanism could also deliver other system events (MCP disconnections, workspace changes) and would be configurable per-tool. This is left as a follow-up design.

TIP

RFD 011 introduces the system message queue. RFD 037 introduces the await tool for explicit synchronization and configurable per-tool notifications and turn-end behavior.

Backward compatibility of `ToolState` serialization

The ToolState JSON format (with "type": "spawned" etc.) is different from the current Outcome format (with "type": "success" etc.). JP needs to accept both formats from tool stdout. The detection heuristic: if the type field is one of spawned, running, waiting, stopped, parse as ToolState; if it's success, error, needs_input, parse as Outcome and convert.

Implementation Plan

Phase 1: `ToolState` type in `jp_tool`

Add the ToolState and ToolError types alongside existing Outcome. Add From<Outcome> for ToolState. Add the backward-compatible deserialization that accepts both formats. Unit tests for conversion and serialization.

Can be merged independently. No behavioral changes.

Phase 2: `ToolCommand` type and handle registry

Add ToolCommand to jp_tool. Create a HandleRegistry struct (likely in a new jp_tool::handle module or in jp_cli). Define the AnyToolHandle trait that tool implementations will implement.

Can be merged independently. No behavioral changes yet — the types exist but aren't used.

Phase 3: One-shot tool wrapping

Refactor StreamEventHandler.handle_tool_call to use the stateful protocol internally. Replace the NeedsInput retry loop with the spawn → fetch → apply cycle. The ToolDefinition::call function returns ToolState instead of ToolCallResponse. The wrapping loop converts ToolState::Stopped to ToolCallResponse.

This is the critical integration phase. Existing behavior must be preserved exactly. Extensive testing against the existing tool call test cases.

Depends on Phase 1 and 2.

Phase 4: Stateful tool dispatch

Add the action / id argument parsing. When a tool call includes action, JP dispatches the ToolCommand to the handle registry instead of wrapping in a one-shot loop. Return ToolState as the tool call response content.

Depends on Phase 3.

Phase 5: Schema generation for stateful tools

Add utilities to generate the oneOf schema from a tool's declared action set. Integrate with ToolDefinition::to_parameters_schema. Handle provider-specific schema limitations.

Depends on Phase 4. Can be iterated on independently.

References

RFD 028: Structured Inquiry System — the inquiry system that handles Waiting with QuestionTarget::Assistant.
Query Stream Pipeline — the turn loop and tool execution flow that this RFD modifies.
#392 — PTY-based end-to-end CLI testing (related infrastructure).
interminai — prior art for PTY-based tool interaction, validates the approach.

RFD 009: Stateful Tool Protocol ​

Summary ​

Motivation ​

Concrete example ​

Toward sub-agents ​

Design ​

ToolState — what a tool produces ​

Why content is a string, not a generic or Value ​

State transitions ​

ToolCommand — what JP dispatches internally ​

Apply serves two purposes ​

Per-tool action sets ​

Mapping from current Outcome ​

How the assistant interacts with stateful tools ​

Handle registry ​

Handle lifecycle ​

One-shot tool wrapping ​

Detecting stateful vs. one-shot tools ​

Integration with existing tool system ​

Interaction with the inquiry system ​

ToolState in jp_tool ​

Drawbacks ​

Alternatives ​

Keep tools strictly one-shot, add separate "session" tools ​

Make statefulness a tool config property only ​

Separate ToolState from Outcome entirely ​

Non-Goals ​

Risks and Open Questions ​

Schema generation for stateful tools ​

Handle cleanup guarantees ​

Token cost of stateful interactions ​

Interaction with tool permission system ​

Proactive delivery of stopped handles ​

Backward compatibility of ToolState serialization ​

Implementation Plan ​

Phase 1: ToolState type in jp_tool ​

Phase 2: ToolCommand type and handle registry ​

Phase 3: One-shot tool wrapping ​

Phase 4: Stateful tool dispatch ​

Phase 5: Schema generation for stateful tools ​

References ​

RFD 009: Stateful Tool Protocol

Summary

Motivation

Concrete example

Toward sub-agents

Design

`ToolState` — what a tool produces

Why `content` is a string, not a generic or `Value`

State transitions

`ToolCommand` — what JP dispatches internally

`Apply` serves two purposes

Per-tool action sets

Mapping from current `Outcome`

How the assistant interacts with stateful tools

Handle registry

Handle lifecycle

One-shot tool wrapping

Detecting stateful vs. one-shot tools

Integration with existing tool system

Interaction with the inquiry system

`ToolState` in `jp_tool`

Drawbacks

Alternatives

Keep tools strictly one-shot, add separate "session" tools

Make statefulness a tool config property only

Separate `ToolState` from `Outcome` entirely

Non-Goals

Risks and Open Questions

Schema generation for stateful tools

Handle cleanup guarantees

Token cost of stateful interactions

Interaction with tool permission system

Proactive delivery of stopped handles

Backward compatibility of `ToolState` serialization

Implementation Plan

Phase 1: `ToolState` type in `jp_tool`

Phase 2: `ToolCommand` type and handle registry

Phase 3: One-shot tool wrapping

Phase 4: Stateful tool dispatch

Phase 5: Schema generation for stateful tools

References