RFD D03: External Attachment URI Scheme
- Status: Draft
- Category: Design
- Authors: Jean Mertz git@jeanmertz.com
- Date: 2026-04-01
- Requires: RFD 065, RFD 066
Summary
This RFD introduces support for attaching files from outside the workspace directory (e.g. jp q --attach ~/Downloads/report.pdf) and defines the external: URI scheme for identifying these resources. External attachments are content-snapshotted on attach, privacy-safe (no absolute paths stored in conversation state), and compatible with RFD 066's blob store and RFD 067's deduplication.
Motivation
JP's file attachment handler resolves paths relative to the workspace root and rejects files outside it:
// jp_attachment_file_content/src/lib.rs
let Ok(rel) = path.strip_prefix(cwd) else {
warn!(path = %path, "Attachment path outside of working directory, skipping.");
return None;
};This is intentional — the workspace is self-contained and shareable. But it prevents a common workflow: attaching a document downloaded from the web, a design spec from another project, or a data file from a shared drive.
Today, the workaround is to copy the file into the workspace first. This is inconvenient and pollutes the project directory with unrelated files.
Why not just allow absolute paths?
Conversations are shareable across team members via Git. If I attach /Users/jean/Downloads/report.pdf, that path is meaningless on my teammate's machine. Worse, it leaks my filesystem structure (username, directory layout) into shared conversation state.
With RFD 065's snapshot model, the content is captured at attachment time, so teammates can read the conversation — but the canonical URI still matters. It is what the LLM sees, what appears in jp attachment ls, and what RFD 067 uses for deduplication matching. Storing raw absolute paths as canonical URIs creates both a privacy problem and a portability problem.
What we need
A URI scheme for external attachments that:
- Preserves content via snapshot (no re-resolution from the original path).
- Does not leak the user's filesystem structure.
- Supports deduplication: attaching the same file twice from the same location produces the same URI.
- Avoids false deduplication: attaching files with the same name from different directories produces different URIs.
- Works with RFD 066's blob store and garbage collection.
- Works with RFD 067's
(canonical_uri, checksum)dedup matching.
Design
The external: URI scheme
External attachments use the external: scheme with a hashed directory component and the original filename:
external:<sha256-of-canonical-parent-directory>/<filename>Examples:
| Original path | Canonical parent | URI |
|---|---|---|
~/Downloads/report.pdf | /Users/jean/Downloads | external:a1b2c3.../report.pdf |
~/Desktop/report.pdf | /Users/jean/Desktop | external:f4e5d6.../report.pdf |
~/Downloads/report.pdf (same as first) | /Users/jean/Downloads | external:a1b2c3.../report.pdf |
The hash is SHA-256 of the canonicalized parent directory path (symlinks resolved, .. collapsed, ~ expanded). SHA-256 is chosen for consistency with RFD 066's blob store, which uses the same algorithm for content addressing.
URI construction
When the file attachment handler receives a path outside the workspace root:
- Canonicalize the full path (
std::fs::canonicalize). - Split into parent directory and filename.
- Compute
sha256(parent_directory_as_utf8_bytes). - Construct
external:<hex-digest>/<filename>.
The filename is preserved verbatim (not hashed) for LLM readability. The LLM sees external:a1b2c3.../report.pdf rather than a fully opaque identifier — the filename provides useful context for reasoning about the resource.
Deduplication behavior
RFD 067 deduplicates by (canonical_uri, checksum). The external: scheme produces deterministic URIs for the same (directory, filename) pair, which gives the correct dedup behavior:
| Scenario | Same URI? | Same checksum? | Dedup? |
|---|---|---|---|
| Same file, attached twice, content unchanged | yes | yes | yes |
| Same file, attached twice, content changed | yes | no | no |
| Same filename, different directories | no | maybe | no |
| Same file, different machines | no (different parent hash) | maybe | no |
The last row is correct: two people attaching the same file from their own machines are independent attachment acts. The content is snapshotted independently for each.
Privacy
No absolute path is stored in conversation state. The parent directory hash is opaque — it cannot be reversed to recover the original path. The filename is preserved, which could be considered a minor leak (it reveals that a file named report.pdf was attached), but this is unavoidable for LLM usability and no worse than the current source field on workspace-internal attachments.
Snapshot-only semantics
External attachments are not re-resolvable. The external: scheme signals to JP that the content exists only as a snapshot in the blob store. There is no source to refresh from.
If RFD 065's refresh_resource tool is called on an external: URI, it returns an error explaining that external resources cannot be refreshed — the original file may have moved, been deleted, or may not exist on the current machine.
Blob store and garbage collection
External attachment content is stored in the blob store (RFD 066) identically to workspace attachment content. The blob is referenced by SHA-256 checksum from the ChatRequest.resources entry in events.json.
Garbage collection works unchanged. GC scans events.json for referenced checksums and deletes unreferenced blobs. The URI scheme is metadata in events.json — GC never inspects URIs, only $blob checksum references. When a conversation containing an external attachment is deleted, the blob reference disappears and GC cleans up the blob.
Attachment handler changes
The file attachment handler (jp_attachment_file_content) currently rejects paths outside the workspace. This RFD changes the behavior:
- If the resolved path is inside the workspace root: produce a
file:///URI as today. No change. - If the resolved path is outside the workspace root: produce an
external:URI using the scheme described above.
In both cases, the content is read and snapshotted at attachment time. The difference is only in the canonical URI stored in the resource metadata.
The handler's add() method currently validates against the workspace root implicitly (via strip_prefix). This validation is relaxed to allow absolute paths and ~/ paths, while the get() method handles URI construction.
CLI interface
No new flags are needed. The existing --attach flag accepts the path:
jp q --attach ~/Downloads/report.pdf "Summarize this document"
jp q --attach /tmp/data.csv "Parse this data"Relative paths are resolved against the current working directory (as today). If the resolved path falls outside the workspace, the external: scheme is used. If inside, the file:/// scheme is used.
Drawbacks
External attachments are not refreshable. Once attached, the content is frozen. If the source file changes, the user must re-attach it. This is inherent to the snapshot model and consistent with the design principle that conversations are deterministic records.
Filename collisions in display. Two external attachments from different directories with the same filename (e.g. two different config.yaml files) appear identical in jp attachment ls unless the user inspects the full URI. The hashed directory prefix is not human-readable. In practice, the LLM can distinguish them by content and by their position in the conversation.
Alternatives
Store the original absolute path
Use the raw filesystem path as the canonical URI (e.g. file:///Users/jean/Downloads/report.pdf).
Rejected because it leaks the user's filesystem structure into shared conversation state. Teammate machines have different paths. The URI becomes meaningless and potentially sensitive.
UUID-based URIs
Use external:<uuid>/<filename> with a random UUID per attachment act.
Rejected because it breaks same-file deduplication. Attaching the same file twice produces two different URIs, so RFD 067 cannot deduplicate them even when the content is identical. The directory-hash approach preserves dedup for repeated attachments of the same file from the same location.
Content-hash URIs
Use blob:<sha256-of-content>/<filename> with the content hash as the identifier.
Rejected because it causes false deduplication. Two different files with identical content and the same filename would get the same URI, and RFD 067 would treat them as the same resource. The LLM would lose track of which file is which.
Non-Goals
Tool access to external files. This RFD only covers user-initiated attachments via --attach. Tools remain restricted to the workspace root (or the ProjectFiles VFS in a future design). Expanding tool access to arbitrary filesystem paths is a separate concern with different security implications.
Cross-machine re-resolution. External attachments are snapshots. There is no mechanism for a teammate to refresh an external attachment from their own copy of the file. If this becomes a need, a future RFD could introduce a "shared external resource" concept, but that is out of scope here.
Risks and Open Questions
Symlinks and mount points. std::fs::canonicalize resolves symlinks, which means ~/Downloads/report.pdf and ~/Dropbox/Downloads/report.pdf (if Downloads is a symlink) produce different canonical parents and thus different URIs. This is technically correct (they are different paths) but may surprise users who expect dedup to work across symlinks. The risk is low — this is an edge case.
Filename encoding. Non-UTF-8 filenames cannot be represented in the URI scheme. This is consistent with JP's existing UTF-8 requirement (the project uses camino::Utf8Path throughout). Non-UTF-8 filenames are rejected at the attachment handler level.
Implementation Plan
Phase 1: Allow external paths in the file handler
Relax the workspace-root restriction in jp_attachment_file_content. Construct external: URIs for out-of-workspace files. Content reading and snapshotting work unchanged — the file is read at attachment time regardless of location.
Depends on RFD 065 (typed resource model) being implemented, since the external: URI is stored on the Resource.uri field.
Depends on RFD 066 (blob store) being implemented, since external attachment content must be persisted in the blob store rather than inline.
Phase 2: Display and listing
Update jp attachment ls to display external attachments with a visual indicator (e.g. a different prefix or icon) so users can distinguish workspace and external resources at a glance.