RFD 064: Non-Destructive Conversation Compaction
- Status: Discussion
- Category: Design
- Authors: Jean Mertz <git@jeanmertz.com>
- Date: 2025-07-17
- Supersedes: RFD 036
Summary
This RFD introduces conversation compaction as a non-destructive, additive operation. Instead of mutating or deleting events in the stored conversation, compaction appends overlay events that instruct the projection layer to present a reduced view when building the LLM request. The original events are always preserved. Compaction policies are defined per content type (summary, reasoning, tool calls), composed across multiple compaction events, and configured at the workspace and conversation level.
Motivation
Long-running conversations degrade LLM performance. Research confirms that when models take a wrong turn early in a conversation, they don't recover (see: Issue #57). Even when the conversation stays on track, growing context windows cause:
- Higher cost. Every cached and uncached input token is billed. Tool call responses — file contents, grep results, test output — dominate the token count in coding conversations.
- Slower responses. More input tokens mean higher time-to-first-token.
- Lower quality. Models lose focus in long contexts. Obsolete tool results and abandoned tangents actively mislead the model.
- Context window overflow. Eventually the conversation exceeds the model's window and fails outright.
Today, users work around this by forking the last turn (jp conversation fork --last 1) and losing all prior context. This is effective but blunt — it discards useful context along with the noise.
JP needs a way to selectively reduce conversation size while preserving the context that matters. Multiple existing RFDs defer to this one:
- RFD 011 (System Notification Queue): "If JP ever implements conversation compaction..."
- RFD 034 (Inquiry Config): "smarter compaction (summarization, middle-out trimming) is orthogonal"
RFD 036 proposed compaction as a destructive transformation of the ConversationStream — strategies that mutate or replace events, with fork-by-default as a safety net. This RFD supersedes that design with a non-destructive approach: compaction events are appended to the stream and define a projection of the original events, preserving the full history while presenting a reduced view to the LLM.
Design
Core Concept: Compaction as Overlay
A compaction event is an InternalEvent variant — like ConfigDelta — that modifies how preceding events are interpreted when building the LLM request. It does not modify or delete any existing events.
InternalEvent::ConfigDelta → "from here on, use this config"
InternalEvent::Compaction → "when building the LLM view, apply these
policies to events in this range"

The original events remain in events.json. The projection layer in Thread::into_parts() reads all compaction events, builds a projection plan, and yields the appropriate view to the provider.
User-Facing Behavior
The compact Command
jp conversation compact [ID] [OPTIONS]

Compacts the active conversation (or the specified one) by appending a compaction event. The original events are untouched.
# Compact with workspace defaults
jp conversation compact
# Compact using a named profile
jp conversation compact --profile heavy
# Compact a specific range
jp conversation compact --from 5h --to 1h
# Compact everything except the last 3 turns
jp conversation compact --keep-last 3
# Preview what would change
jp conversation compact --dry-run

Flags:
| Flag | Default | Description |
|---|---|---|
| --profile <name> | default | Named compaction profile from config. |
| --from <bound> | start of conversation | Start of the compacted range (inclusive). |
| --to <bound> | end of conversation | End of the compacted range (inclusive). |
| --keep-last <N> | from config | Shorthand for --to N turns ago. |
| --dry-run | false | Preview mechanical effects without applying. |
Range bounds accept several formats:
| Value | Example | Meaning |
|---|---|---|
| Positive integer | --from 5 | Absolute turn index (0-based). |
| Negative integer | --to -3 | 3 turns before the last turn. |
| Duration string | --from 5h | Time ago (resolved to a turn index). |
| last | --from last | Turn of the most recent compaction event, or start if none. |
--from without a value defaults to last. All bounds are resolved to absolute turn indices at creation time and stored as integers.
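The bound resolution described above can be sketched as follows. This is a minimal illustration with hypothetical names (not the actual jp implementation); duration strings are omitted because resolving them requires event timestamps.

```rust
/// Resolve a CLI range bound to an absolute 0-based turn index.
/// Returns None for unparseable input or out-of-range negatives.
fn resolve_bound(
    value: &str,
    total_turns: usize,
    last_compaction_turn: Option<usize>,
) -> Option<usize> {
    match value {
        // `last`: turn of the most recent compaction event, or start if none.
        "last" => Some(last_compaction_turn.unwrap_or(0)),
        v => match v.parse::<isize>().ok()? {
            // Non-negative: already an absolute turn index.
            n if n >= 0 => Some(n as usize),
            // Negative: N turns before the last turn.
            n => total_turns.checked_sub(1)?.checked_sub(n.unsigned_abs()),
        },
    }
}
```

Because bounds are resolved at creation time, the stored compaction event only ever contains absolute integers, regardless of which input form the user typed.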
The --compact Flag on query
# Compact with default profile, then query
jp query --compact -- "Continue working on the feature"
# Compact with a named profile, then query
jp query --compact=heavy "Continue working on the feature"

Equivalent to jp conversation compact followed by jp query. --compact alone uses the conversation's default profile; --compact=<name> uses the named profile.
The --compact Flag on fork
# Fork and compact with default profile
jp conversation fork --compact
# Fork and compact with a named profile
jp conversation fork --compact=heavy

Forks the conversation and appends a compaction event to the fork. Uses the forked conversation's resolved compaction config.
Viewing Compacted Conversations
# Print the full history (default)
jp conversation print
# Print the compacted view (what the LLM sees)
jp conversation print --compacted

Compaction Event Model
The Compaction Type
A compaction event defines an explicit range and optional per-content-type policies:
/// A compaction overlay stored in the event stream.
///
/// Defines how events within [from_turn, to_turn] should be projected
/// when building the LLM request. The original events are unmodified.
pub struct Compaction {
pub timestamp: DateTime<Utc>,
/// First turn in the compacted range (inclusive, 0-based).
pub from_turn: usize,
/// Last turn in the compacted range (inclusive, 0-based).
pub to_turn: usize,
/// When set, replaces ALL provider-visible events in the range
/// with a pre-computed summary. Takes precedence over `reasoning`
/// and `tool_calls`.
pub summary: Option<SummaryPolicy>,
/// Policy for ChatResponse::Reasoning events.
/// Ignored when `summary` is set.
pub reasoning: Option<ReasoningPolicy>,
/// Policy for ToolCallRequest and ToolCallResponse pairs.
/// Ignored when `summary` is set.
pub tool_calls: Option<ToolCallPolicy>,
}

None means "this compaction has no opinion on this content type" — the original events pass through, or an earlier compaction's policy applies.
Per-Content-Type Policies
Each content type has its own policy enum, carrying only what makes sense for that type:
pub enum ReasoningPolicy {
/// Omit all reasoning events from the projected view.
Strip,
}
/// Replaces ALL provider-visible events in the range with a
/// pre-computed summary. Messages, reasoning, and tool calls are
/// all replaced by a single synthetic ChatRequest/ChatResponse pair.
pub struct SummaryPolicy {
/// The summary text, generated at compaction-creation time.
pub summary: String,
}
pub enum ToolCallPolicy {
/// Replace request arguments and/or response content with compact
/// summaries. Preserves tool name, call ID, and success/error status.
///
/// Parses from strings for config ergonomics:
/// - `"strip"` → Strip { request: true, response: true }
/// - `"strip-responses"` → Strip { request: false, response: true }
/// - `"strip-requests"` → Strip { request: true, response: false }
///
/// Or inline table: `{ policy = "strip", request = true, response = true }`
Strip {
/// Replace arguments with a compact summary.
request: bool,
/// Replace response content with a status line.
response: bool,
},
/// Remove all tool call pairs entirely.
Omit,
}Eagerness Principle
Transformations fall into two categories:
- Eager (store the result). Expensive or non-deterministic operations — LLM-generated summaries. The output is stored in the compaction event (SummaryPolicy { summary }) because regenerating it would be costly and potentially different each time.
- Lazy (store the policy). Cheap, deterministic operations — stripping reasoning, replacing tool responses with a status line, omitting events. The policy is stored (e.g. ToolCallPolicy::Strip { response: true, .. }), and the projection layer applies it at read time.
Integration with InternalEvent
The compaction event is a new variant of InternalEvent, alongside ConfigDelta and Event:
pub enum InternalEvent {
ConfigDelta(ConfigDelta),
Event(Box<ConversationEvent>),
Compaction(Compaction),
}

Like ConfigDelta, compaction events are stream metadata — they are not visible to providers, not counted by ConversationStream::len(), and are preserved by retain().
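As a sketch of that accounting, with stand-in payload types (the real ConversationStream and its payloads live in jp_conversation):

```rust
// Stand-in payloads; the real variants carry ConfigDelta,
// Box<ConversationEvent>, and Compaction respectively.
enum InternalEvent {
    ConfigDelta(String),
    Event(String),
    Compaction(String),
}

/// Only conversation events count toward the stream length; config
/// deltas and compaction overlays are metadata and are skipped.
fn conversation_len(events: &[InternalEvent]) -> usize {
    events
        .iter()
        .filter(|e| matches!(e, InternalEvent::Event(_)))
        .count()
}
```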
Projection Layer
The projection layer transforms the raw event stream into the view sent to the LLM. It is applied in Thread::into_parts(), which already filters events via is_provider_visible().
Algorithm
- Collect all Compaction events from the stream.
- For each conversation event at turn T with content type C:
  a. Find all compaction events where from_turn <= T <= to_turn and the policy for C is Some.
  b. Of those, the one with the latest timestamp wins.
  c. Apply the winning policy: keep, omit, strip, or substitute.
- When summary is set: omit ALL provider-visible events in the range (messages, reasoning, tool calls), and inject a synthetic ChatRequest/ChatResponse::Message pair at the from_turn position containing the pre-computed summary. The reasoning and tool_calls policies are ignored for events in the range.
This logic lives in a new ConversationStream::projected_iter() method (or similar), called by Thread::into_parts() instead of the raw iterator.
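The per-event resolution can be sketched as follows. Types and names are illustrative stand-ins: policies are reduced to booleans, timestamps to integers, and the summary-precedence rule from the stacking semantics is applied first.

```rust
#[derive(Clone, Copy)]
enum ContentType { Message, Reasoning, ToolCall }

/// Simplified overlay: the real Compaction carries Option<...Policy>
/// fields; booleans stand in for Some(_).
struct Overlay {
    timestamp: u64,
    from_turn: usize,
    to_turn: usize,
    summary: bool,          // Some(SummaryPolicy) in the real type
    reasoning_strip: bool,  // Some(ReasoningPolicy::Strip)
    tool_calls_strip: bool, // Some(ToolCallPolicy::Strip { .. })
}

#[derive(Debug, PartialEq)]
enum Verdict { Keep, Strip, Summarized }

/// Resolve the fate of one event at `turn` with content type `ct`.
fn resolve(overlays: &[Overlay], turn: usize, ct: ContentType) -> Verdict {
    let covers = |o: &&Overlay| o.from_turn <= turn && turn <= o.to_turn;
    // A summary overlay covering the turn subsumes per-type policies.
    if overlays.iter().filter(covers).any(|o| o.summary) {
        return Verdict::Summarized;
    }
    // Otherwise the latest covering overlay with an opinion on `ct`
    // wins. With real policy payloads, the winner's policy would be
    // applied; here any covering opinion collapses to Strip.
    let winner = overlays
        .iter()
        .filter(covers)
        .filter(|o| match ct {
            ContentType::Reasoning => o.reasoning_strip,
            ContentType::ToolCall => o.tool_calls_strip,
            ContentType::Message => false, // no per-type message policy
        })
        .max_by_key(|o| o.timestamp);
    if winner.is_some() { Verdict::Strip } else { Verdict::Keep }
}
```

Applied to the stacking example later in this RFD (summary compaction A over turns 0-20, tool-call strip B over turns 0-30), this reproduces the winner table: turn 5 is summarized, turn 25 tool calls are stripped, turn 25 reasoning is kept.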
Projection Example
A concrete example showing how the projection applies across event types:
Raw stream (turns 0-2, then turns 3+ uncompacted):
Turn 0: TurnStart
Turn 0: ChatRequest("set up the project")
Turn 0: ChatResponse::Message("I'll create the project structure.")
Turn 0: ToolCallRequest(id="1", fs_create_file, {path: "src/main.rs"})
Turn 0: ToolCallResponse(id="1", ok, "<200 lines of code>")
Turn 0: ChatResponse::Message("Created src/main.rs with a basic setup.")
Turn 1: TurnStart
Turn 1: ChatRequest("add error handling")
Turn 1: ChatResponse::Reasoning("<500 tokens of thinking>")
Turn 1: ToolCallRequest(id="2", fs_read_file, {path: "src/main.rs"})
Turn 1: ToolCallResponse(id="2", ok, "<200 lines of code>")
Turn 1: ToolCallRequest(id="3", fs_modify_file, {path: "src/main.rs"})
Turn 1: ToolCallResponse(id="3", ok, "<300 lines of diff>")
Turn 1: ChatResponse::Message("Added error handling to main.")
Turn 2: TurnStart
Turn 2: ChatRequest("now add logging")
Turn 2: ChatResponse::Reasoning("<400 tokens of thinking>")
Turn 2: ToolCallRequest(id="4", fs_modify_file, {path: "src/main.rs"})
Turn 2: ToolCallResponse(id="4", ok, "<250 lines of diff>")
Turn 2: ChatResponse::Message("Added tracing-based logging.")

With the default profile (reasoning: Strip, tool_calls: Strip):
Compaction event (after turn 2):
from_turn: 0, to_turn: 2
summary: None
reasoning: Strip
tool_calls: Strip { request: true, response: true }
Projected view:
ChatRequest("set up the project")
ChatResponse::Message("I'll create the project structure.")
ToolCallRequest(id="1", fs_create_file, {[compacted]})
ToolCallResponse(id="1", ok, "[compacted] fs_create_file: success")
ChatResponse::Message("Created src/main.rs with a basic setup.")
ChatRequest("add error handling")
ToolCallRequest(id="2", fs_read_file, {path: "src/main.rs"})
ToolCallResponse(id="2", ok, "[compacted] fs_read_file: success")
ToolCallRequest(id="3", fs_modify_file, {[compacted]})
ToolCallResponse(id="3", ok, "[compacted] fs_modify_file: success")
ChatResponse::Message("Added error handling to main.")
ChatRequest("now add logging")
ToolCallRequest(id="4", fs_modify_file, {[compacted]})
ToolCallResponse(id="4", ok, "[compacted] fs_modify_file: success")
ChatResponse::Message("Added tracing-based logging.")
...turns 3+ uncompacted...

Reasoning is stripped, tool arguments and responses are compacted. Note that fs_read_file at id="2" keeps its arguments (per-tool hint request = "keep") while fs_create_file and fs_modify_file have their arguments stripped (per-tool hint request = "strip" because they carry large file content). Messages and conversation structure are preserved.
With the heavy profile (summary: Summarize):
Compaction event (after turn 2):
from_turn: 0, to_turn: 2
summary: SummaryPolicy { summary: "Set up a Rust project at src/main.rs
with error handling and tracing-based logging." }
reasoning: None
tool_calls: None
Projected view:
ChatRequest("[Summary of previous conversation]")
ChatResponse::Message("Set up a Rust project at src/main.rs
with error handling and tracing-based logging.")
...turns 3+ uncompacted...

The two profiles show the distinction:
- default (mechanical): Conversation structure is preserved. Reasoning is stripped, tool responses are replaced with status lines. Messages and tool call requests remain — the model sees the full flow of what happened, minus the bulk.
- heavy (summarization): Everything in the range is replaced by a single summary. The summarizer reads ALL raw events (messages, reasoning, tool calls) to produce the summary, so tool usage and decisions are captured in the text. No orphaned events remain.
When summary is set, reasoning and tool_calls are ignored — the summary replaces everything. They only apply when compacting without summarization.
Stacking Semantics
Multiple compaction events compose independently per content type. For each event, per content type, the latest compaction whose range covers that event wins.
Example:
Compaction A (turn 20): from=0, to=20, summary=SummaryPolicy("...")
Compaction B (turn 30): from=0, to=30, tool_calls=Strip { request: false, response: true }

| Turn | Event type | A | B | Winner |
|---|---|---|---|---|
| 5 | Any | Summarize | — | A: Summarize |
| 5 | Tool calls | Summarize | Strip | A: Summarize* |
| 25 | Tool calls | out of range | Strip | B: Strip |
| 25 | Reasoning | out of range | — | Keep |
* summary takes precedence over per-type policies when both cover an event.
Summary Overlap Resolution
Summaries are holistic representations of a range — they cannot be split or sliced. Partial overlaps between summary ranges would produce irreconcilable conflicts (two summaries covering partially the same turns, potentially contradicting each other).
Rule: when creating a new compaction with summary: Some(SummaryPolicy), if any existing summary compaction partially overlaps with the new range, the new range is auto-extended to fully subsume the existing one.
Formally: given new range [X, Y] and existing summary range [A, B], if the ranges intersect but neither fully contains the other, extend to [min(X, A), max(Y, B)]. Repeat until no partial overlaps remain. The summarizer then reads raw events for the extended range.
This constraint applies only when summary is set. All other policies operate per-event and compose naturally with partial overlaps.
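The auto-extension rule can be sketched as a pure function over turn ranges (helper names are hypothetical; ranges are inclusive `(from, to)` pairs):

```rust
/// True when the ranges intersect but neither fully contains the other.
fn partially_overlaps((x, y): (usize, usize), (a, b): (usize, usize)) -> bool {
    let intersects = x <= b && a <= y;
    let new_contains_old = x <= a && b <= y;
    let old_contains_new = a <= x && y <= b;
    intersects && !new_contains_old && !old_contains_new
}

/// Extend [X, Y] until it partially overlaps no existing summary range.
/// Each merge strictly grows the range, so the loop terminates.
fn extend_range(
    mut new: (usize, usize),
    existing: &[(usize, usize)],
) -> (usize, usize) {
    loop {
        match existing.iter().find(|&&e| partially_overlaps(new, e)) {
            Some(&(a, b)) => new = (new.0.min(a), new.1.max(b)),
            None => return new,
        }
    }
}
```

For example, a new summary range (15, 25) against an existing summary over (0, 20) extends to (0, 25); one merge can trigger another, which is why the function loops until a fixed point.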
Raw-Stream Invariant
Summarization always reads the raw (non-compacted) event stream. The summarizer sees the original messages, not prior summaries. This prevents compound information loss — summarizing a summary degrades quality at each step.
When compaction B's range overlaps with compaction A's range, B's summarizer reads the original events for its full range, ignoring A's summary entirely. At projection time, B's summary wins for the overlapping region (it has a later timestamp), and it is a faithful summary of the originals.
This is already guaranteed by the additive design — the raw events are always in events.json — but it is worth stating as an invariant: no code path should feed a projected view to a summarizer.
Strategies
A strategy is a function that analyzes a ConversationStream and produces a Compaction event. Strategies do not mutate the stream.
Mechanical Strategies
These are pure transformations that don't require LLM calls.
strip-reasoning
Produces a compaction with reasoning: Some(ReasoningPolicy::Strip) for the specified range.
Impact: Moderate token reduction for models that emit extended thinking.
strip-tools
Produces a compaction with tool_calls: Some(ToolCallPolicy::Strip { .. }) for the specified range. At projection time, tool response content is replaced with a status line ([compacted] {tool_name}: {success|error}) and/or request arguments are replaced with a compact summary. Which fields are stripped depends on the profile and per-tool hints.
Impact: High. Tool responses and arguments (especially for file-writing tools) dominate token count in coding conversations.
LLM-Assisted Strategies
summarize
Sends the raw events in the specified range to an LLM with instructions to produce a concise summary. Produces a compaction with summary: Some(SummaryPolicy { summary }). When set, this replaces all provider-visible events in the range.
The summarization prompt instructs the model to preserve key decisions, file paths, error resolutions, and the current state of the task. The model and prompt are configurable per-profile (see Configuration).
Impact: High. Replaces an arbitrary number of turns with a short summary.
Configuration
Compaction is configured at the workspace and conversation level, following the same defaults-plus-named-profiles pattern used by tool configuration.
[conversation.compaction]
# The profile to use when --profile is not specified.
default_profile = "default"
# Number of recent turns to preserve (used by profiles that don't
# override it). Shorthand for setting `to` to N turns ago.
keep_last = 3
# Default compaction profile. Applied by `--compact` with no arguments.
[conversation.compaction.profiles.default]
reasoning = "strip"
tool_calls = "strip"
# A heavier profile that includes summarization.
# When summary is set, it replaces all events in the range —
# reasoning and tool_calls policies are not needed.
[conversation.compaction.profiles.heavy.summary]
policy = "summarize"
model = "anthropic/claude-haiku"
# instructions = """
# Summarize this conversation for continuity. Preserve:
# - File paths and code structures discussed
# - Key decisions and their rationale
# - Current task state and next steps
# """
# A minimal profile for quick cleanup.
[conversation.compaction.profiles.light]
reasoning = "strip"

Profiles define which per-type policies to apply. The range (from, to, keep_last) comes from the CLI flags or the top-level keep_last default. A profile does not encode a range — ranges are an invocation-time concern.
Conversation-level overrides (via --cfg) can change any of these for a specific conversation.
Per-Tool Compaction Hints
Tools can declare how their calls should be compacted. This is a new optional field in the tool configuration:
[conversation.tools.fs_read_file.compaction]
request = "keep" # "keep" | "strip"
response = "strip" # "keep" | "strip"

Per-tool hints override the profile's Strip policy for individual tools. A tool with response = "keep" is exempted from response stripping even under a policy that sets response: true.
Example defaults for the JP project:
| Tool | request | response |
|---|---|---|
| fs_read_file | keep | strip |
| fs_grep_files | keep | strip |
| cargo_check | keep | strip |
| cargo_test | keep | strip |
| fs_create_file | strip | keep |
| fs_modify_file | strip | strip |
| git_commit | strip | keep |
These are workspace-level tool configurations, not built-in defaults. Each workspace defines its own tools and compaction hints.
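The field-by-field override could be resolved as in this sketch (types and names are illustrative, not the actual ToolConfig; `true` means "strip"):

```rust
/// The profile's effective Strip policy for a tool call pair.
#[derive(Clone, Copy, Debug, PartialEq)]
struct StripPolicy { request: bool, response: bool }

/// A per-tool hint; `None` means "no opinion, defer to the profile".
#[derive(Clone, Copy, Default)]
struct ToolHint { request: Option<bool>, response: Option<bool> }

/// The hint overrides the profile field-by-field when present.
fn effective(profile: StripPolicy, hint: Option<ToolHint>) -> StripPolicy {
    let hint = hint.unwrap_or_default();
    StripPolicy {
        request: hint.request.unwrap_or(profile.request),
        response: hint.response.unwrap_or(profile.response),
    }
}
```

Under the default profile (strip both fields), the fs_read_file hint above (request = "keep", response = "strip") yields a policy that strips only the response, matching the projection example earlier in this RFD.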
Drawbacks
- Summaries are lossy. Even though the original events are preserved, the LLM only sees the compacted view. A poor summary can mislead the model worse than a long conversation. Mitigation: summaries are generated from raw events (never from prior summaries), and the summarization model and prompt are configurable.
- Storage growth. Compaction events add to the stream rather than reducing stored size. Summary text in SummaryPolicy can be non-trivial. In practice this is small compared to the tool responses they overlay, but it is additive rather than reductive.
- Projection complexity. The projection layer adds a code path between the raw stream and the LLM. Bugs in projection logic could cause the LLM to see inconsistent state. Mitigation: the projection is a pure function of the stream, fully testable without LLM calls.
- Prompt cache invalidation. Adding a compaction event changes the projected prefix, invalidating any cached conversation history. System prompt caching is unaffected (it is a separate prefix). This is acceptable for manual compaction but would be problematic for automatic compaction.
- --dry-run cannot preview summaries. For mechanical strategies, dry-run accurately shows the projected view. For summarization, dry-run can only report "will generate a summary for turns X-Y using model Z" — it cannot show the actual summary without spending tokens on an LLM call, and re-running without --dry-run would produce a different summary anyway.
Alternatives
Destructive compaction (RFD 036)
The original design: strategies mutate the ConversationStream directly, with fork-by-default as a safety net. Rejected because:
- Information loss. Once events are deleted, they're gone. Fork mitigates but doesn't solve — you end up with two conversations, one intact and one damaged.
- No undo. Reverting a compaction requires restoring from the fork.
- Fork proliferation. Each compaction creates a new conversation, cluttering the conversation list.
- Conflated concerns. Destructive compaction mixes "what to send to the LLM" (a view concern) with "what to store on disk" (a persistence concern).
Automatic compaction on every turn
Compact transparently when approaching the context window limit. Rejected for this RFD: compaction is lossy and should be an explicit user decision. Automatic compaction has additional design constraints (caching interaction, interval control, trigger conditions) that warrant a separate proposal.
Single monolithic compact operation
One "compact" that does everything. Rejected: different conversations need different compaction. A coding conversation benefits from tool response stripping; a discussion benefits from summarization. Named profiles with per-type policies let users tailor the operation.
Non-Goals
- Automatic compaction. This RFD covers explicit, user-initiated compaction. Automatic compaction (triggered by context window proximity or turn count thresholds) has different design constraints — caching interaction, trigger intervals, rolling window semantics — and is deferred to a follow-up RFD. The config namespace (conversation.compaction.auto) is reserved.
- Provider-delegated compaction. Some providers offer server-side compaction (Anthropic returns readable summaries, OpenAI returns opaque encrypted blobs). In practice, readable provider summaries offer no advantage over JP's own SummaryPolicy using the same model, and opaque formats cannot be integrated into JP's event model. Provider delegation may become interesting if providers offer compaction capabilities that can't be replicated client-side, but that's not the case today.
- Custom external strategies. An extension point where an external binary receives the raw events and range, and returns replacement events that JP stores in the compaction event. This is analogous to how tools work today (external process, structured I/O) and would enable domain-specific compaction logic. The compaction event model supports this (replacement events are just the policy payloads), but the protocol and CLI integration are deferred.
- Tool subsumption protocol. RFD 036 proposed an Action::Subsumes tool protocol extension where tools could declare that one call subsumes another (e.g., read_file(1,10) subsumes read_file(2,5)). This is a refinement that can be added later without changing the compaction event model.
- Interactive tangent classification. RFD 036 proposed a classify-tangents strategy that uses an LLM to identify off-topic turns. Interesting but orthogonal to the core compaction model.
- Tool call deduplication. Identifying and removing duplicate tool calls (same name, same arguments) across turns. While potentially useful, it adds complexity to the compaction model (per-call selective policies) for marginal benefit. Can be added as a ToolCallPolicy::Selective variant later if needed.
- Conversation merging. Combining two conversations into one.
- Interactive stream editing. A $EDITOR-based workflow for manually removing or reordering events in the raw stream (similar to git rebase -i). This is a separate, destructive operation on the stored events — distinct from compaction's non-destructive overlay model — and warrants its own RFD.
Risks and Open Questions
- Summarization prompt design. The summary needs to preserve the right context — key decisions, file paths, error resolutions, task state. What should the prompt look like? This needs experimentation during implementation. We should take inspiration from Anthropic's default summarization prompt.
- Turn boundary correctness. Range resolution must handle edge cases: conversations with only 1 turn, turns with no tool calls, interrupted turns. The existing fork --last implementation is a reference.
- Config delta preservation. ConversationStream interleaves ConfigDelta events with conversation events. The projection layer must preserve config deltas correctly — compacting a range should not affect config state for events outside that range.
- Summary injection and provider expectations. The synthetic ChatRequest/ChatResponse pair injected for summaries must maintain the user/assistant alternation that providers expect. Needs testing across Anthropic, OpenAI, Google, and local providers.
Implementation Plan
Phase 1: Compaction Event Model
- Add InternalEvent::Compaction(Compaction) to jp_conversation.
- Define the Compaction struct, per-type policy enums, and serialization.
- Update ConversationStream to handle the new variant: is_empty(), len(), retain(), sanitize() should treat compaction events like config deltas (preserved, not counted).
- Add unit tests for serialization roundtrip and stream invariants.
Can be merged independently. No behavioral changes.
Phase 2: Projection Layer
- Add ConversationStream::projected_iter() that applies compaction overlays to yield the projected view.
- Implement the stacking semantics (latest-wins per content type).
- Implement summary injection (synthetic ChatRequest/ChatResponse pair).
- Wire Thread::into_parts() to use projected_iter().
- Add unit tests for each policy type, stacking, and summary overlap auto-extension.
Depends on Phase 1. After this phase, compaction events in the stream will affect what the LLM sees.
Phase 3: Mechanical Strategies and CLI
- Implement strategy functions: strip_reasoning, strip_tools. Each produces a Compaction event.
- Implement range bound resolution (negative integers, duration strings, last → absolute turn index).
- Add the jp conversation compact CLI command with --profile, --from, --to, --keep-last, --dry-run.
- Add --compact[=profile] to jp conversation fork.
- Add --compacted to jp conversation print.
- Add integration tests.
Depends on Phase 2.
Phase 4: Configuration
- Add conversation.compaction config section with default_profile, keep_last.
- Add conversation.compaction.profiles support (named policy sets).
- Add per-tool compaction hints to ToolConfig.
- Wire profiles into the CLI (--profile flag, --compact defaults).
- Add config tests.
Depends on Phase 3. Can be partially parallelized with Phase 3 (config types can be defined before the CLI is wired up).
Phase 5: LLM-Assisted Summarization
- Implement the summarize strategy: read raw events, call the configured model, produce SummaryPolicy { summary }.
- Implement the summary overlap auto-extension logic.
- Add --compact[=profile] to jp query.
- Add integration tests (with mock LLM).
Depends on Phase 2. Can proceed in parallel with Phases 3 and 4.
References
- RFD 011 — System Notification Queue (compaction interaction)
- RFD 034 — Inquiry-Specific Assistant Configuration (defers compaction)
- RFD 036 — Conversation Compaction (superseded by this RFD)
- Issue #57 — Make conversation management more powerful
- Multi-turn degradation paper — cited in Issue #57