RFD 053: Auto-Refresh Conversation Titles
- Status: Discussion
- Category: Design
- Authors: Jean Mertz git@jeanmertz.com
- Date: 2026-03-18
Summary
Conversation titles are generated once after the first turn and never automatically updated. This RFD adds periodic refresh: when conversation.title.generate.auto_refresh.turn_interval is set to a positive integer N (default 5), conversations that have accumulated at least N new turns since their titles were last generated are re-titled as background tasks on the next jp query run, least recently activated first and at most batch_size per run.
Motivation
A title is generated on the first turn of a new conversation and then frozen. This works well for short, focused conversations, but longer ones take unexpected turns and end up with a title that describes only the opening exchange. The user is left with a list of conversations whose titles no longer reflect what they contain.
The user can already run conversation edit --title to manually regenerate a title, but this requires noticing the problem and taking action. Periodic automatic refresh should be transparent.
The fix needs to be careful about cost. Triggering a new LLM request on every jp query invocation for every stale conversation in the workspace would spike API usage and potentially delay the CLI on exit. This design processes a bounded number of conversations per run in the background — the same pattern already used for initial title generation.
Design
Configuration
A new auto_refresh sub-table is added to conversation.title.generate:
[conversation.title.generate]
auto = true
model = ...
[conversation.title.generate.auto_refresh]
turn_interval = 5 # refresh every N turns; 0 = disabled (default = 5)
batch_size = 1 # max conversations to refresh per run; or "all" (default = 1)
turn_context = 10 # max turns sent to LLM for re-titling (default = 10)
turn_interval = 0 disables the feature entirely.
batch_size controls how many stale conversations are refreshed per jp query invocation. The default of 1 spreads the work across runs. Setting it to "all" refreshes every stale conversation in a single run — useful for catching up after enabling the feature on a workspace with many long-running conversations, at the cost of more LLM requests.
turn_context limits how many recent turns are sent to the LLM when re-generating a title. For long conversations, earlier turns are often irrelevant to what the conversation is currently about. The default of 10 keeps costs predictable and focuses the title on recent activity. Setting it to false disables the limit and sends the full conversation.
These fields map to a new nested AutoRefreshConfig on GenerateConfig:
pub struct AutoRefreshConfig {
    pub turn_interval: usize, // default 5
    pub batch_size: BatchSize, // default Count(1)
    pub turn_context: Option<usize>, // default Some(10), None = unlimited
}
// Public because it appears in a public field of AutoRefreshConfig.
pub enum BatchSize {
    Count(usize),
    All,
}
When conversation.title.generate.auto = false, auto-refresh is also disabled regardless of turn_interval.
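The defaults and the enablement rule above can be sketched as follows. This is a minimal illustration, not the actual `jp_config` code: the `Default` impl, `is_enabled`, and `limit` are hypothetical helpers, and `auto` stands for the value of `conversation.title.generate.auto`.

```rust
/// Hypothetical mirror of the RFD's config types, using the stated defaults.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BatchSize {
    Count(usize),
    All,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AutoRefreshConfig {
    pub turn_interval: usize,
    pub batch_size: BatchSize,
    pub turn_context: Option<usize>,
}

impl Default for AutoRefreshConfig {
    fn default() -> Self {
        Self {
            turn_interval: 5,
            batch_size: BatchSize::Count(1),
            turn_context: Some(10),
        }
    }
}

impl AutoRefreshConfig {
    /// Auto-refresh runs only when title generation is on (`auto`)
    /// and `turn_interval` is positive. (Hypothetical helper.)
    pub fn is_enabled(&self, auto: bool) -> bool {
        auto && self.turn_interval > 0
    }
}

impl BatchSize {
    /// Number of conversations to process in one run, given `stale`
    /// stale conversations. (Hypothetical helper.)
    pub fn limit(&self, stale: usize) -> usize {
        match self {
            BatchSize::Count(n) => stale.min(*n),
            BatchSize::All => stale,
        }
    }
}
```

With the defaults, a run refreshes at most one conversation, and setting either `auto = false` or `turn_interval = 0` turns the feature off.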
Persisted state: title_generated_at_turn
To determine staleness, the system needs to know how many turns existed when the title was last generated. A new optional field is added to Conversation in metadata.json:
/// Turn count when the title was last auto-generated.
///
/// - `None` — legacy conversation; treated as baseline 0 (auto-enrolls in
/// refresh once it has accumulated `turn_interval` turns).
/// - `Some(n)` — title was auto-generated when the conversation had `n` turns.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub title_generated_at_turn: Option<usize>,
This field is a pure watermark, not configuration. It lives in metadata.json alongside title because they are tightly coupled: one is the output, the other is the checkpoint that governs when it is regenerated.
| Value | Meaning | Auto-refresh eligible? |
|---|---|---|
| `None` | Legacy conversation (pre-feature) | Yes, baseline 0 |
| `Some(n)` | Auto-generated at turn `n` | Yes, baseline `n` |
Interaction with conversation edit --title
conversation edit --title (no argument) regenerates the title via LLM. This sets title_generated_at_turn = Some(current_turn_count), enrolling the conversation in future auto-refresh with the current position as baseline.
conversation edit --title "My Custom Title" sets a user-provided title and writes a ConfigDelta event with auto_refresh.turn_interval = 0 to the conversation's event stream, disabling auto-refresh for that conversation. The user explicitly chose this title; the system should not overwrite it. To re-enable auto-refresh later, the user can run config set conversation.title.generate.auto_refresh.turn_interval 5 (or any positive value) to write a new ConfigDelta that re-enrolls the conversation.
conversation edit --no-title removes the title entirely and writes a ConfigDelta event with auto_refresh.turn_interval = 0, disabling auto-refresh for that conversation. The user explicitly chose to have no title; the system should not generate one later.
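The three cases above reduce to a small mapping from the edit action to its side effects. The sketch below is illustrative only: `TitleEdit`, `EditEffect`, and `edit_effect` are hypothetical names, and the `ConfigDelta` write is represented by a plain `Option<usize>` override.

```rust
/// The three forms of `conversation edit` described above (hypothetical names).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TitleEdit {
    /// `--title` with no argument: regenerate the title via the LLM.
    Regenerate,
    /// `--title "..."`: the user supplies the title.
    Custom(String),
    /// `--no-title`: remove the title.
    Remove,
}

/// What the command changes besides the title text itself.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EditEffect {
    /// New value for `title_generated_at_turn`, if it is updated.
    pub checkpoint: Option<usize>,
    /// Per-conversation `turn_interval` override written as a `ConfigDelta`.
    pub turn_interval_override: Option<usize>,
}

pub fn edit_effect(edit: &TitleEdit, current_turn_count: usize) -> EditEffect {
    match edit {
        // An LLM-generated title re-baselines the conversation at its current position.
        TitleEdit::Regenerate => EditEffect {
            checkpoint: Some(current_turn_count),
            turn_interval_override: None,
        },
        // An explicit user choice opts this conversation out of auto-refresh.
        TitleEdit::Custom(_) | TitleEdit::Remove => EditEffect {
            checkpoint: None,
            turn_interval_override: Some(0),
        },
    }
}
```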
Background task
A new TitleRefreshTask runs the full refresh pipeline in the background: candidate scanning, turn counting, stream loading, and LLM calls. The main thread's only responsibility is spawning the task — all heavy I/O happens off the critical path.
Spawn (main thread)
At the start of a jp query invocation, after the workspace is loaded, a single TitleRefreshTask is spawned if turn_interval > 0 and auto = true. The task receives:
- A `Storage` handle for filesystem access. `Storage` encapsulates directory structure, dual-root resolution (workspace + user), and file I/O. The task uses its existing methods (`load_all_conversation_ids`, `load_conversation_metadata`, `load_conversation_stream`) rather than performing raw filesystem walks.
- The active conversation ID (to exclude it from candidacy).
- The `AutoRefreshConfig` from configuration.
- Provider and model configuration for the LLM call.
No conversation scanning or event file reading happens on the main thread.
Run (background)
The task performs the following steps inside run():
1. List all conversation IDs via `Storage::load_all_conversation_ids`.
2. For each conversation, load metadata via `Storage::load_conversation_metadata` to get `title`, `title_generated_at_turn`, `last_activated_at`, and `turn_count`. `load_conversation_metadata` already calls the lightweight `load_count_and_timestamp_events` to populate `events_count` and `last_event_at` from `events.json`. This function is extended to also deserialize the `type` field and count `turn_start` events, populating a new `turn_count` field on `Conversation`. The extension adds one field to the internal `RawEvent` struct:

       #[derive(serde::Deserialize)]
       struct RawEvent {
           timestamp: Box<serde_json::value::RawValue>,
           #[serde(rename = "type")]
           event_type: Box<serde_json::value::RawValue>,
       }

   A `turn_start` event is counted when `event_type.get()` equals `"\"turn_start\""`. This avoids a second pass over `events.json`.
3. Skip the active conversation (it is being actively worked on).
4. Compute staleness: a conversation is stale if `turn_count >= baseline + turn_interval`, where `baseline` is `title_generated_at_turn.unwrap_or(0)`.
5. Sort stale conversations by `last_activated_at` ascending (least recently active first).
6. Take up to `batch_size` candidates.
7. For each candidate, load the full `ConversationStream` via `Storage::load_conversation_stream`. Check the conversation's merged config via `stream.config()`. If `turn_interval` has been overridden to `0` via a `ConfigDelta` (e.g., by `conversation edit --title "..."`), skip this candidate. Otherwise, scope the stream to the last `turn_context` turns (if configured) and make the LLM title generation call.
If a file read fails (e.g., partially written by a concurrent session), the task skips that conversation and moves to the next.
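The candidate selection in the steps above (skip the active conversation, test staleness, sort, take a batch) can be sketched as a pure function. This is a simplified illustration under stated assumptions: `Candidate` is a hypothetical stand-in for the loaded metadata, `last_activated_at` is a plain integer rather than a real timestamp, and `batch_size` is modeled as `Option<usize>` with `None` meaning "all".

```rust
/// Hypothetical slice of conversation metadata needed for selection.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub id: String,
    pub turn_count: usize,
    pub title_generated_at_turn: Option<usize>,
    /// Stand-in for a timestamp; a smaller value means activated longer ago.
    pub last_activated_at: u64,
}

/// Staleness rule from the RFD: `turn_count >= baseline + turn_interval`,
/// where a missing checkpoint counts as baseline 0.
pub fn is_stale(c: &Candidate, turn_interval: usize) -> bool {
    let baseline = c.title_generated_at_turn.unwrap_or(0);
    c.turn_count >= baseline.saturating_add(turn_interval)
}

pub fn select_candidates(
    mut all: Vec<Candidate>,
    active_id: &str,
    turn_interval: usize,
    batch_size: Option<usize>, // None models batch_size = "all"
) -> Vec<Candidate> {
    // turn_interval = 0 disables the feature entirely.
    if turn_interval == 0 {
        return Vec::new();
    }
    // Exclude the active conversation and keep only stale ones.
    all.retain(|c| c.id != active_id && is_stale(c, turn_interval));
    // Least recently activated first.
    all.sort_by_key(|c| c.last_activated_at);
    if let Some(n) = batch_size {
        all.truncate(n);
    }
    all
}
```

The per-conversation `ConfigDelta` override check from the last step is deliberately left out here, since it requires loading the full stream for each selected candidate.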
The task checks its CancellationToken between each candidate. If cancelled (e.g., the user's query has finished and TaskHandler is shutting down), the task stops immediately — it does not wait for an in-flight LLM response or process remaining candidates. Any titles already generated by completed candidates are kept and synced; incomplete work is discarded. This avoids delaying CLI exit when batch_size is large or set to "all".
Title retention schema
To avoid unnecessary title churn, the title generation schema is extended with a retain_current field. The LLM receives the current title in its prompt and can indicate that it is still adequate:
{
"retain_current": false,
"titles": [
"New title suggestion"
]
}
The prompt includes:
> The conversation currently has the title: "{current_title}". If this title still accurately describes the conversation, set `retain_current` to `true`. Only generate new titles if the conversation has meaningfully changed direction.
When retain_current is true, the task advances the title_generated_at_turn checkpoint (recording that the title was evaluated) but leaves title unchanged. This prevents the same conversation from being re-evaluated on every run while keeping its perfectly good title.
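A minimal sketch of how a parsed response might be applied to the conversation metadata is shown below. The types and the `apply_response` function are hypothetical, not the existing `jp_llm::title` API. Two details are assumptions rather than part of this RFD: the first suggested title is the one used, and a response with `retain_current = false` but no usable title is treated as a failure that leaves the checkpoint untouched.

```rust
/// Parsed form of the extended title schema (hypothetical type).
pub struct TitleResponse {
    pub retain_current: bool,
    pub titles: Vec<String>,
}

/// The two metadata fields the refresh touches (hypothetical type).
pub struct Meta {
    pub title: Option<String>,
    pub title_generated_at_turn: Option<usize>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Title kept; checkpoint advanced.
    Retained,
    /// Title replaced; checkpoint advanced.
    Replaced,
    /// Unusable response; nothing changed (assumed failure handling).
    Invalid,
}

pub fn apply_response(
    meta: &mut Meta,
    resp: &TitleResponse,
    turn_count_at_evaluation: usize,
) -> Outcome {
    if resp.retain_current {
        // Record that the title was evaluated, but leave it unchanged.
        meta.title_generated_at_turn = Some(turn_count_at_evaluation);
        return Outcome::Retained;
    }
    match resp.titles.first().map(|t| t.trim()) {
        Some(t) if !t.is_empty() => {
            meta.title = Some(t.to_string());
            meta.title_generated_at_turn = Some(turn_count_at_evaluation);
            Outcome::Replaced
        }
        _ => Outcome::Invalid,
    }
}
```

Advancing the checkpoint in both successful branches is what keeps a conversation with a still-adequate title from being re-evaluated on every run.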
Context window safety
The turn_context setting (default 10) provides the first line of defense against oversized requests: only the most recent N turns are sent to the LLM. This scoping happens before any token-level checks.
The title generation model may still have a smaller context window than those N turns require. The inquiry system already solves this problem: it estimates char-based token counts and drops older events to fit the model's context window.
The core truncation logic — estimate chars, compare to budget, drop oldest events, re-sanitize — is extracted from jp_cli::cmd::query::tool::inquiry into a shared utility (in jp_llm or jp_conversation) that both the inquiry backend and the title generator can use. Each caller computes its own overhead (the inquiry system accounts for tools, attachments, and cache-preserving granularity; the title generator only needs system prompt and title instructions).
The pipeline for each candidate is: scope to last turn_context turns → estimate chars → truncate if over budget → send to LLM.
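The scoping and truncation stages of this pipeline can be sketched as follows. This is a simplified model, not the logic extracted from the inquiry system: each turn is a list of event strings, the budget is a character count the caller has already computed (model context minus its own overhead), and the re-sanitize step is omitted. The function name `scope_and_truncate` is hypothetical.

```rust
use std::collections::VecDeque;

/// Keep the last `turn_context` turns (all turns if `None`), then drop the
/// oldest events until the estimated size fits within `budget_chars`.
pub fn scope_and_truncate(
    turns: &[Vec<String>],
    turn_context: Option<usize>,
    budget_chars: usize,
) -> Vec<String> {
    // Stage 1: scope to the most recent turns.
    let start = match turn_context {
        Some(n) => turns.len().saturating_sub(n),
        None => 0,
    };
    let mut events: VecDeque<&String> = turns[start..].iter().flatten().collect();

    // Stage 2: estimate size by character count.
    let mut total: usize = events.iter().map(|e| e.chars().count()).sum();

    // Stage 3: drop oldest events while over budget.
    while total > budget_chars {
        match events.pop_front() {
            Some(e) => total -= e.chars().count(),
            None => break,
        }
    }

    events.into_iter().cloned().collect()
}
```

Because scoping runs first, the character estimate only ever covers the retained turns, which keeps the cost of the estimate itself bounded by `turn_context`.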
Sync (main thread)
After the background task completes, sync writes results back to the workspace. For each successfully refreshed candidate:
- Update `conversation.title` (unless `retain_current` was `true`).
- Set `conversation.title_generated_at_turn = Some(turn_count_at_evaluation)`, where `turn_count_at_evaluation` is the turn count observed by the background task when it read the conversation's events.
Using the count at evaluation time rather than at sync time means the checkpoint advances by what was true when the decision was made, not by any turns added during the current session.
If a candidate fails (LLM error, parse failure), the task logs a warning and skips it. Successful candidates are still synced.
Interaction with conversation locks (RFD 020)
Once RFD 020 lands, conversations are protected by exclusive file locks during write operations. The title refresh task interacts with locks as follows:
Read phase (run): The task reads metadata.json and events.json through Storage without acquiring a lock. Read-only access does not require a lock per RFD 020's model. If a concurrent session is writing to a conversation and the file is partially written, the JSON parse fails and the task skips that conversation.
Write phase (sync): The task attempts a non-blocking lock (try_lock) on each candidate conversation before writing. If the lock is held by another session, the title update for that conversation is discarded — the work is lost but no corruption occurs. The conversation will be retried on the next eligible run. If the lock is free, the task acquires it, writes the metadata update, and releases it.
This approach avoids blocking the CLI exit on lock contention and naturally handles the common case: stale conversations are by definition idle, so lock contention on them is rare.
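The write phase can be illustrated with an in-memory stand-in for the locks. Everything here is hypothetical: `LockTable` only models "is this conversation currently locked" and is not RFD 020's file-lock API, and `Refreshed` and `sync_results` are illustrative names. The point is the control flow: try the lock without blocking, discard the result on contention, otherwise write and release.

```rust
use std::collections::HashSet;

/// In-memory stand-in for per-conversation exclusive locks (assumption).
pub struct LockTable {
    held: HashSet<String>,
}

impl LockTable {
    pub fn new() -> Self {
        Self { held: HashSet::new() }
    }

    /// Non-blocking: returns false if the lock is already held.
    pub fn try_lock(&mut self, id: &str) -> bool {
        self.held.insert(id.to_string())
    }

    pub fn unlock(&mut self, id: &str) {
        self.held.remove(id);
    }
}

/// Result produced by the background task for one conversation.
pub struct Refreshed {
    pub id: String,
    /// `None` when the LLM chose to retain the current title.
    pub new_title: Option<String>,
    pub turn_count_at_evaluation: usize,
}

/// Writes each result whose lock is free; returns the IDs that were discarded.
pub fn sync_results(
    locks: &mut LockTable,
    results: &[Refreshed],
    mut write: impl FnMut(&Refreshed),
) -> Vec<String> {
    let mut discarded = Vec::new();
    for r in results {
        if !locks.try_lock(&r.id) {
            // Held by another session: drop the update, retry on a later run.
            discarded.push(r.id.clone());
            continue;
        }
        write(r);
        locks.unlock(&r.id);
    }
    discarded
}
```

A discarded update costs one wasted LLM request, but the conversation stays stale and is picked up again on a later eligible run.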
Spawn location
The TitleRefreshTask is spawned in query.rs, alongside the existing first-turn title spawn. This restricts title refresh to jp query — the only command with a meaningful conversation lifetime and where an LLM call is already expected. Short-lived commands (conversation ls, conversation edit, etc.) do not trigger it.
The existing first-turn title spawn in query.rs is updated to set title_generated_at_turn = Some(1), so all new conversations have a baseline and become eligible for future auto-refresh.
Drawbacks
Each jp query run spawns a background task that loads metadata for every conversation via Storage. Since load_conversation_metadata already reads events.json (for events_count and last_event_at), the turn counting extension adds no extra file reads — it piggybacks on the existing lightweight parse. For workspaces with hundreds of conversations this is nonzero I/O, though it happens entirely in the background and does not delay the user's query.
The retain_current schema adds a small amount of complexity to the title generation prompt and response handling. Models may occasionally set retain_current = false and produce a title that is semantically identical to the original, causing cosmetic churn. This is a minor nuisance, not a correctness issue.
Alternatives
Timestamp-based staleness. Track when the title was generated and refresh if enough time has elapsed. Rejected: time is a weaker signal than turns. A conversation that receives one turn per day and one that receives twenty turns per hour have the same time-based staleness but very different content drift.
Use events_count as a proxy for turns. Already computed and readily available. Rejected: it's imprecise. A single turn with heavy tool use generates many events; the threshold would behave inconsistently across different usage patterns. Turn count is the right unit.
Cache turn_count in metadata.json. Avoids reading events.json during candidacy checks. Rejected: this introduces derived state from events.json into metadata.json, breaking the convention that all conversation-level behavioral state flows through the event stream's ConfigDelta. The background task architecture makes this optimization unnecessary — the I/O happens off the critical path.
Scan on the main thread, load streams in the background. Perform candidate selection synchronously and only push the LLM call to the background. Rejected: candidate scanning requires reading metadata.json for every conversation and events.json for stale candidates. This forces eager loading of all conversation metadata on the main thread, changing jp query startup from O(1) disk reads (active conversation only) to O(N). Moving the entire pipeline to the background keeps startup cost at O(1).
Non-Goals
This RFD does not change when or how the initial title is generated. The first-turn behavior is unchanged except for setting title_generated_at_turn.
It does not add any user-visible indication that a title was refreshed in the background.
Risks and Open Questions
Concurrent CLI runs. Two simultaneous jp query invocations could both spawn a TitleRefreshTask that selects the same stale conversation. The result is two LLM requests producing the same (or a slightly different) title — no data corruption, just a wasted request. Once RFD 020 lands, the sync-phase locking prevents concurrent metadata writes; the second task's try_lock fails and the update is discarded.
Token cost of re-titling long conversations. The turn_context default of 10 bounds the typical cost, but users who set turn_context = false (unlimited) or have very long individual turns may still send large payloads. The context window truncation utility provides a hard safety net, but the cost scales with the retained context size. Worth monitoring once the feature ships.
Title quality on truncated context. Both turn_context scoping and context window truncation mean the LLM sees only a suffix of the conversation. The generated title will reflect recent activity rather than the full arc. This is an acceptable trade-off — recent activity is usually more relevant to what the user is currently working on — but users should be aware that titles may shift focus as the conversation evolves.
Implementation Plan
Phase 0: Shared truncation utility (independent)
- Extract the core truncation logic (estimate chars, compare to budget, drop oldest events, re-sanitize) from `jp_cli::cmd::query::tool::inquiry` into a shared utility in `jp_llm` or `jp_conversation`.
- Update the inquiry backend to use the shared utility.
- Update `TitleGeneratorTask::update_title` to truncate the event stream when the title model's context window is smaller than the conversation.
Phase 1: State (independent)
- Add `title_generated_at_turn: Option<usize>` to `Conversation` in `jp_conversation`.
- Add `turn_count: usize` (computed, `#[serde(skip)]`) to `Conversation`.
- Extend `load_count_and_timestamp_events` in `jp_storage` to count `turn_start` events and populate `turn_count`.
- Update `conversation edit --title` (no argument) to set `title_generated_at_turn = Some(current_turn_count)` after LLM generation.
- Update `conversation edit --title "..."` (user-provided) to write a `ConfigDelta` with `auto_refresh.turn_interval = 0`.
- Update `conversation edit --no-title` to write a `ConfigDelta` with `auto_refresh.turn_interval = 0`.
Phase 2: Configuration (independent)
- Add `AutoRefreshConfig` (with `turn_interval`, `batch_size`, `turn_context`) as a nested config on `GenerateConfig` in `jp_config`.
- Wire through `AssignKeyValue`, `PartialConfigDelta`, and `ToPartial` impls.
Phase 3: Title retention schema (independent)
- Extend `title_schema` and `title_instructions` in `jp_llm::title` with the `retain_current` field and current-title prompt context.
- Update `extract_titles` (or add a companion function) to handle the `retain_current` response.
Phase 4: Task and spawn (depends on Phase 0, 1, 2, 3)
- Implement `TitleRefreshTask` with the full background pipeline: conversation listing via `Storage`, metadata reading, turn counting, stream loading, LLM calls, and per-candidate sync with `try_lock`.
- Update the existing first-turn title spawn in `query.rs` to set `title_generated_at_turn = Some(1)`.
- Spawn `TitleRefreshTask` in `query.rs` when `turn_interval > 0` and `auto = true`.
Phases 0–3 can be reviewed and merged independently. Phase 4 depends on all of them.