RFD 012: Typed LLM Streaming Events
- Status: Discussion
- Category: Design
- Authors: Jean Mertz git@jeanmertz.com
- Date: 2026-03-15
Summary
This RFD replaces ConversationEvent inside jp_llm::Event::Part with a purpose-built EventPart enum that models what LLM providers actually stream: message chunks, reasoning chunks, structured data chunks, and tool call lifecycle events. This decouples the streaming transport layer from the persistence layer, eliminates invalid states, and prepares the infrastructure for incremental tool call argument streaming.
Motivation
Event::Part currently carries a ConversationEvent:
pub enum Event {
Part { index: usize, event: ConversationEvent },
Flush { index: usize, metadata: IndexMap<String, Value> },
Finished(FinishReason),
}ConversationEvent is a persistence type. It carries a timestamp, a metadata map, and an EventKind that can be any of seven variants: TurnStart, ChatRequest, ChatResponse, ToolCallRequest, ToolCallResponse, InquiryRequest, InquiryResponse.
In the streaming context, only a subset is valid:
| EventKind variant | Used in Event::Part? |
|---|---|
ChatResponse::Message | Yes — message content chunks |
ChatResponse::Reasoning | Yes — reasoning content chunks |
ChatResponse::Structured | Yes — but hacked as Value::String(chunk) |
ToolCallRequest | Yes — but only as a start signal with empty arguments |
TurnStart | No |
ChatRequest | No |
ToolCallResponse | No |
InquiryRequest | No |
InquiryResponse | No |
This creates several problems:
Invalid states are representable. The
EventBuildermust silently ignoreChatRequest,ToolCallResponse,InquiryRequest,InquiryResponse, andTurnStartin itshandle_partmethod. These should be unrepresentable at the type level.Structured data is a hack. Structured response chunks piggy-back on
ChatResponse::Structured { data: Value::String(chunk) }— a persistence type abused as transport. This has caused bugs where consumers process structured data chunks as regular message content.Tool call starts are overloaded. When a tool call begins, the provider emits a
ToolCallRequestwith empty arguments. This is a persistence type pretending to be a "streaming has started" signal.Wasted allocations. Every streaming chunk allocates a
ConversationEventwith a timestamp and metadata map that nobody reads during streaming.Growing
ConversationEvent. As the persistence type grows (new event kinds, richer metadata), every streaming chunk pays the cost even though it only uses a small fraction of the type.
Design
New Event structure
pub enum Event {
/// Streaming data for a given index, accumulated until Flush.
Part {
index: usize,
part: EventPart,
/// Metadata accumulated during streaming (e.g. thinking signatures).
/// Usually empty.
metadata: Map<String, Value>,
},
/// Flush accumulated Parts for this index into a ConversationEvent.
Flush {
index: usize,
metadata: IndexMap<String, Value>,
},
/// The response stream is finished.
Finished(FinishReason),
}EventPart
/// A chunk of streaming data from an LLM provider.
///
/// Each variant maps to a distinct content type that providers
/// differentiate between. The EventBuilder accumulates these into
/// ConversationEvents on Flush.
pub enum EventPart {
/// A chunk of assistant message content.
Message(String),
/// A chunk of reasoning/thinking content.
Reasoning(String),
/// A chunk of structured response JSON.
Structured(String),
/// Tool call streaming data.
ToolCall(ToolCallPart),
}ToolCallPart
/// Streaming events for a single tool call.
pub enum ToolCallPart {
/// Tool call identity. First non-empty value wins per field when
/// multiple Start events arrive for the same index.
Start {
id: String,
name: String,
},
/// A raw JSON chunk of tool call arguments.
ArgumentChunk(String),
}EventBuilder translation
The EventBuilder becomes an explicit translator between the streaming domain (EventPart) and the persistence domain (ConversationEvent):
| EventPart received | Buffer action | ConversationEvent on Flush |
|---|---|---|
Message(s) | Append to string buffer | ChatResponse::Message { message } |
Reasoning(s) | Append to string buffer | ChatResponse::Reasoning { reasoning } |
Structured(s) | Append to JSON string buffer | Parse JSON → ChatResponse::Structured { data } |
ToolCall(Start { id, name }) | Initialize tool call buffer | — |
ToolCall(ArgumentChunk(s)) | Feed ToolCallRequestAggregator | ToolCallRequest { id, name, arguments } |
The ToolCallRequestAggregator remains unchanged — it buffers raw JSON argument chunks and parses them on finalize, as today. This RFD only changes the transport; it does not change how arguments are parsed.
Metadata handling
Streaming metadata (e.g. Anthropic's thinking signatures via SignatureDelta) currently piggy-backs on ConversationEvent's metadata map. With EventPart, metadata moves to a field on Event::Part:
Event::Part {
index: 0,
part: EventPart::Reasoning("thinking...".into()),
metadata: Map::new(), // usually empty; populated for signature deltas
}The EventBuilder accumulates Part-level metadata the same way it does today — merging it into the final ConversationEvent's metadata on Flush.
Provider changes
Each provider's stream mapper changes from constructing ConversationEvents to constructing EventParts. This is a mechanical transformation:
Before (Anthropic example):
// Message content
Event::Part {
index,
event: ConversationEvent::now(ChatResponse::Message {
message: text.clone(),
}),
}
// Tool call start
Event::Part {
index,
event: ConversationEvent::now(ToolCallRequest {
id: id.clone(),
name: name.clone(),
arguments: Map::new(),
}),
}
// Structured data chunk
Event::Part {
index,
event: ConversationEvent::now(ChatResponse::Structured {
data: Value::String(chunk),
}),
}After:
// Message content
Event::Part {
index,
part: EventPart::Message(text.clone()),
metadata: Map::new(),
}
// Tool call start
Event::Part {
index,
part: EventPart::ToolCall(ToolCallPart::Start {
id: id.clone(),
name: name.clone(),
}),
metadata: Map::new(),
}
// Structured data chunk
Event::Part {
index,
part: EventPart::Structured(chunk),
metadata: Map::new(),
}TurnCoordinator changes
The TurnCoordinator currently inspects ConversationEvent variants inside Event::Part to decide what to render. With EventPart, it matches directly on the streaming types:
Event::Part { index, part, metadata } => {
match &part {
EventPart::Message(text) => {
self.chat_renderer.render_message(text);
}
EventPart::Reasoning(text) => {
self.chat_renderer.render_reasoning(text);
}
EventPart::Structured(chunk) => {
self.structured_renderer.render_chunk(chunk);
}
EventPart::ToolCall(ToolCallPart::Start { id, name }) => {
self.chat_renderer.flush();
// register tool call for preparing display
}
EventPart::ToolCall(ToolCallPart::ArgumentChunk(_)) => {
// forwarded to EventBuilder only; no rendering yet
}
}
self.event_builder.handle_part(index, part, metadata);
}Backwards compatibility
ConversationEventis unchanged — it remains the persistence type used byConversationStream.- The
ConversationStreamis not affected. Event::FlushandEvent::Finishedare unchanged.- The
Event::Partvariant changes its inner type. All consumers ofEvent::Partmust be updated (providers, EventBuilder, TurnCoordinator). This is a breaking change withinjp_llm.
What does NOT change
ConversationEventandEventKind(persistence types)ConversationStream(persistence layer)ToolCallRequestAggregator(argument parsing — see RFD 048 for its replacement)- The query stream pipeline's Part > Flush > Finished lifecycle
Drawbacks
Breaking change to
Event. Every provider and every consumer ofEvent::Partmust be updated. This is a mechanical transformation but touches many files.Two type hierarchies for the same concepts.
EventPart::MessageandChatResponse::Messagerepresent the same concept (assistant message content) at different layers. This is intentional — streaming and persistence have different concerns — but it means the EventBuilder translates between parallel hierarchies.
Alternatives
Keep ConversationEvent in Event::Part, add ValueChunk variant
Add Event::ValueChunk { index, chunk } alongside the existing Event::Part for raw JSON chunks (tool call arguments, structured responses). This is the minimal change needed for incremental argument streaming.
Rejected because:
Event::Partcontinues to carry invalid variants (ChatRequest,TurnStart, etc.) that the EventBuilder silently ignores.- Structured data still piggy-backs on
ChatResponse::StructuredwithValue::String(chunk)— the hack that has caused bugs. - Tool call starts still overload
ToolCallRequestwith empty arguments. - The streaming and persistence layers remain coupled.
EventPart with a generic Json variant instead of Structured + ToolCall
Use EventPart::Json(String) for both structured response chunks and tool call argument chunks, routing by index buffer type in EventBuilder.
Rejected because:
- The EventBuilder already routes by type (message vs reasoning vs structured vs tool call). Having typed variants at the EventPart level is consistent with this routing.
Json(String)loses the tool call identity information thatToolCall(Start { id, name })provides.- Two different streaming use cases (structured data, tool call arguments) have different downstream handling — typed variants make this explicit.
Non-Goals
Incremental tool call argument streaming. This RFD keeps the
ToolCallRequestAggregatorin the EventBuilder for argument parsing. TheArgumentChunkvariant forwards raw JSON to the aggregator, same as today's behavior but with cleaner transport.Changing ConversationEvent. The persistence type is unchanged. This RFD only affects the streaming transport layer.
Risks and Open Questions
Provider-specific streaming patterns. Some providers may stream tool call
idandnameacross multiple deltas rather than in a single start event. TheToolCallPart::Startdesign handles this via first-non-empty-wins semantics (same as today'smerge_tool_call), but this needs validation with all supported providers (Anthropic, OpenRouter, Llamacpp, Ollama, Google).Metadata edge cases. Moving metadata from
ConversationEventtoEvent::Partshould be straightforward, but any provider-specific metadata patterns need testing.
Implementation Plan
Phase 1: Define EventPart types
Add EventPart, ToolCallPart to jp_llm. These are pure types with no behavioral changes.
Phase 2: Update EventBuilder
Change EventBuilder::handle_part to accept EventPart instead of ConversationEvent. Update the IndexBuffer creation and accumulation logic. The output (flushed ConversationEvents) is identical.
Phase 3: Update providers
Update Anthropic, OpenRouter, Llamacpp, Ollama, and Google providers to emit EventPart variants instead of constructing ConversationEvents. Remove ConversationEvent construction from the provider layer entirely.
Move the ToolCallRequestAggregator from the provider layer into the EventBuilder — providers now emit ToolCall(ArgumentChunk(chunk)) instead of buffering chunks themselves.
Phase 4: Update TurnCoordinator
Update TurnCoordinator::handle_streaming_event to match on EventPart instead of ConversationEvent. Update the chat renderer, structured renderer, and tool renderer integration points.
Phase 5: Clean up
Remove Event's as_conversation_event(), into_conversation_event(), and related helper methods that assume Part carries a ConversationEvent. Add equivalent helpers for EventPart if needed.
References
crates/jp_llm/src/event.rs— currentEventtypecrates/jp_conversation/src/event.rs—ConversationEvent,EventKindcrates/jp_conversation/src/event_builder.rs—EventBuildercrates/jp_cli/src/cmd/query/turn/coordinator.rs—TurnCoordinatorcrates/jp_llm/src/stream/aggregator/tool_call_request.rs—ToolCallRequestAggregator