Customizing prompts
The knowledge base’s LLM work runs in two places:
- The proposal-drain hook spawns the active harness’s headless driver to turn each captured session log into a structured proposal. Its prompt loads from a local override path, falling back to the bundled template. Edit
.ai/kenkeep/.config/prompts/proposal-extract.mdto customize; delete it to revert. - The kk-bootstrap, kk-curate, and kk-add skills run inside the host harness session (via
<harness> -p "/kk-<name>"or invoked directly). Their prompts live in the skill itself: edit.claude/skills/kk-<name>/SKILL.md(and the.codex/,.cursor/,.opencode/equivalents).
NOTE
On the Claude adapter the proposal-drain hook is a no-op: extraction runs inline during /kk-curate.
TIP
Bump the top-of-file Version: N comment on every behavior change. Logs record the prompt content, so historic decisions stay auditable.
Where each prompt lives
| Prompt | Role |
|---|---|
proposal-extract.md (v1 in the bundled template) | Converts a captured transcript into structured practice and map candidates. Run by kk-proposal-drain via the headless driver. |
kk-curate/SKILL.md | Reads pending session logs, drafts add/modify/contradict/drop actions in-session, hands the merged set to curate-dedup, walks contradictions with the user. (This is the curator logic; there is no separate curator.md.) |
kk-bootstrap/SKILL.md | Enumerates source markdown via finddocs, drafts node bodies inline, persists via node write. |
kk-add/SKILL.md | Conversationally gathers fields, persists via node write. |
| Surface | Source of truth | Local override |
|---|---|---|
| Proposal extraction | templates/prompts/proposal-extract.md | .config/prompts/proposal-extract.md |
| Curate skill | src/templates-source/skills/kk-curate/SKILL.md (regenerated into .claude/skills/kk-curate/SKILL.md etc.) | edit the per-harness copy directly |
| Bootstrap skill | src/templates-source/skills/kk-bootstrap/SKILL.md (regenerated similarly) | edit the per-harness copy directly |
| Manual-add skill | src/templates-source/skills/kk-add/SKILL.md (regenerated similarly) | edit the per-harness copy directly |
The templates-source SKILL.md is canonical; per-harness copies are regenerated from it by template-sync. Edit the per-harness copy when iterating in a consumer repo; edit the templates-source when contributing back to the package.
The durability filter
All three extractor/curator prompts apply the same filter: keep principles and current-state facts, drop actions and story. Specifically dropped:
- Maintenance and lifecycle actions.
- Project story or history, especially any reference to a plan, ticket, or issue.
- Incidental one-off facts dressed up as practices.
This is the single most common reason a candidate is rejected. The sections below reference it rather than restate it.
Pipeline overview
Two extractors emit candidate nodes: the proposal extractor for live sessions and the bootstrap skill for existing docs. The curator skill decides what becomes a file on disk.
flowchart TB
subgraph proposal["Proposal extraction · proposal-extract.md v1"]
direction TB
PI["Captured transcript<br/>role-tagged USER / AGENT segments"]
PG{"Session-disposition gate<br/>abandoned · exploratory<br/>unrelated · meta-only?"}
PEMPTY["Empty proposal<br/>practice=[], map=[]"]
PP1["Practice pass<br/>USER turns + self-review-apply<br/>(incl. corrective patterns)"]
PP2["Map pass<br/>USER or AGENT turns"]
PF2{"Task-specific scope?<br/>one-off names · 'in this PR'<br/>this file / this function"}
PF3{"End-state framing?<br/>present tense<br/>no transition narrative"}
PO["Practice + Map candidates<br/>tags · title · summary · body · confidence"]
PI --> PG
PG -- "non-productive" --> PEMPTY
PG -- "productive" --> PP1
PG -- "productive" --> PP2
PP1 --> PF2
PF2 -- "yes → drop" --> PEMPTY
PF2 -- "no" --> PF3
PP2 --> PF3
PF3 -- "no → drop" --> PEMPTY
PF3 -- "yes" --> PO
end
subgraph bootstrap["Bootstrap skill · kk-bootstrap/SKILL.md"]
direction TB
BI["One markdown doc<br/>=== FILE: path === ... === END FILE ==="]
BS{"Skip filter<br/>API dumps · boilerplate<br/>generic framework · TODOs?"}
BSKIP["Empty output"]
BP1["Practice pass<br/>imperatives · rationale markers<br/>admonitions"]
BP2["Map pass<br/>component headers · definitions<br/>file paths"]
BC["Confidence calibration<br/>high: explicit + maintained<br/>medium: default<br/>low: draft/legacy/ambiguous"]
BO["Practice + Map candidates"]
BI --> BS
BS -- "yes" --> BSKIP
BS -- "no" --> BP1
BS -- "no" --> BP2
BP1 --> BC
BP2 --> BC
BC --> BO
end
subgraph curator["Curator skill · kk-curate/SKILL.md"]
direction TB
CI["Batch of candidates<br/>existing_nodes always empty<br/>(overlap judged from candidate framing)"]
CG{"Non-productive<br/>provenance signals?<br/>hedged · hypothetical · plan-scoped"}
CCF{"Change-oriented framing?<br/>'used to X, now Y' · rename · removal"}
CSAL{"Clean end-state claim<br/>salvageable?"}
COV{"Suspected overlap<br/>with existing knowledge base node?"}
CNEG{"Direct negation?<br/>(both cannot be true<br/>in the same scope)"}
CEXT{"Extends without<br/>negating?"}
CADD["add<br/>writes nodes/<kind>/<slug>.md"]
CMOD["modify<br/>overwrites target node<br/>requires target_node_id on disk<br/>end-state rewrite rule"]
CCON["contradict<br/>writes conflicts/<id>.md<br/>no node touched"]
CDROP["drop<br/>(rationale recorded)"]
CI --> CG
CG -- "yes" --> CDROP
CG -- "no" --> CCF
CCF -- "yes" --> CSAL
CSAL -- "no" --> CDROP
CSAL -- "yes" --> COV
CCF -- "no" --> COV
COV -- "no" --> CADD
COV -- "yes" --> CNEG
CNEG -- "yes" --> CCON
CNEG -- "no" --> CEXT
CEXT -- "yes" --> CMOD
CEXT -- "no, rephrase only" --> CDROP
end
PO --> CI
BO --> CI
CADD --> NODES[("nodes/practice/<br/>nodes/map/")]
CMOD --> NODES
CCON --> CONF[("conflicts/<id>.md")]
Read top to bottom: each extractor short-circuits to an empty output when its gate fires; surviving candidates land in the curator, which routes every candidate to exactly one of four actions. The extractors never interact; the curator is the only stage that writes to nodes/ or conflicts/.
Proposal prompt
The biggest quality lever in capture: it controls what the extractor treats as worth remembering.
Sections
- Version comment.
- What to extract: practice/map definitions, trigger phrases.
- What to skip: typos, file reads, agent paraphrases, generic programming knowledge; the durability filter; and non-productive sessions (abandoned, exploratory, cursory, unrelated, meta-only), which short-circuit to
{"practice": [], "map": []}via the session-disposition gate at the top of the prompt. The gate fires when the session as a whole does not converge on durable knowledge. - Ownership boundary: how to split combined statements between practice and map.
- Inline example: a worked transcript with expected JSON.
- Output schema: must match
ProposalOutputSchema.
The drain replaces [TRANSCRIPT PLACEHOLDER, substituted at runtime] with the captured slice. If the placeholder is removed, the transcript is appended at the end.
Calibration
Fixtures under tests/fixtures/transcripts/:
routine-zero/: a session with no teaching moments. Correct output is empty.bravo-insider/: 4 practice + 3 map candidates.expected.mdis the target.
TIP
Mocked tests only pin the schema. Only a real headless harness run reveals prompt quality, so run the fixtures with the real CLI before shipping changes.
Schema
Output must match ProposalOutputSchema in src/lib/schemas.ts. New fields mean extending the Zod schema. Bump schema_version on rename, removal, or semantic change; new optional fields do not bump.
Curator skill prompt
The kk-curate skill’s SKILL.md decides what happens to every proposal candidate: add, modify, contradict, or drop. Runs in the host harness session. Second-biggest quality lever.
Input
[BATCH PLACEHOLDER] is replaced with:
{
"existing_nodes": [
{ "id": "...", "title": "...", "kind": "practice", "tags": ["..."], "summary": "...", "body": "..." }
],
"batch": [
{
"session_id": "...",
"captured_at": "...",
"derived_from": "session-<id>.md",
"practice_candidates": [...],
"map_candidates": [...]
}
]
}
existing_nodes carries only nodes referenced by supports_existing_node / contradicts_existing_node in the batch. The curator is told to drop any candidate that appears to overlap an existing node not present in existing_nodes, with a rationale naming the suspected overlap.
Output
A single JSON array. Each element:
{
"action": "add | modify | contradict | drop",
"candidate_origin": "<session_id>:<practice|map>:<index>",
"target_node_id": "<id-or-null>",
"proposed_node": { /* full node, or null for drop */ },
"rationale": "...",
"suggested_resolution": null
}
The skill applies actions via the deterministic curate-dedup and node write primitives:
| Action | Behavior |
|---|---|
add | node write atomically writes nodes/<kind>/<id>.md. If the file exists, it resolves the slug via ensureUniqueId (<id>-2, …); a true collision-after-resolution is reported as a failure. |
modify | node write runs against the existing nodes/<kind>/<target_node_id>.md. If the target is missing, the skill records a modify_missing_target failure. |
contradict | The action is piped to curate-dedup, which writes the conflict to .ai/kenkeep/conflicts/<id>.md (status: pending) and stamps the source session log. The skill then walks each conflict in-session with the user. |
drop | No-op. |
suggested_resolution is always emitted as null; resolution happens via the in-session walkthrough.
Verifying
npm test: curate-dedup tests assert that add/modify proposals survive to the output JSON, contradict actions become conflict files underconflicts/, and slug-collision-after-resolution lands in the failure report.- Inspect the harness session transcript for the final proposal JSON the skill piped to
curate-dedup.
Anti-patterns
- Modifications that rephrase existing content (drop instead).
- Additions when a near-duplicate exists (modify instead).
- Suggesting a
suggested_resolutionvalue (it is ignored; the user picks via the kk-curate skill). - Crossing the practice/map boundary.
- Change-oriented framing (transition narratives, migration stories, rename or removal logs): automatic drop regardless of confidence, unless a clean end-state claim can be salvaged.
- Durability-filter violations (see above): automatic drop, unless a durable operating principle or current-state fact can be salvaged.
- Non-productive provenance signatures: candidates whose framing carries hedged wording, references to hypothetical entities, plan- or task-scoped wording, or low-confidence-without-rationale. The curator weighs these signals together (not any single one) and treats a combined signature as evidence the candidate slipped the extractor’s session-disposition gate from an abandoned, exploratory, cursory, unrelated, or meta-only session.
Bootstrap skill prompt
Controls what the kk-bootstrap skill treats as candidates from your source docs. Runs in the host harness session.
Skill behavior
- Discovery: call
finddocs --from <scope> --with-hashesto enumerate candidate markdown. - Per-doc loop: for each surviving doc,
Readit, decide whether it carries durable knowledge (skipping auto-generated reference, licenses, generic framework knowledge, aspirational TODOs, and durability-filter cases), draft practice and map candidates inline, and persist vianode write --source-doc <relpath> --source-hash <sha256>(which folds the hash intobootstrap-state.jsonin the same atomic transaction). - Hash-aware skip: before reading a doc, compare its
finddocs --with-hashesdigest againstbootstrap-state.json. Skip on hit. - Finalize: call
index rebuildto regenerateINDEX.mdandGRAPH.md. - Rules: never invent facts, quote rationale verbatim, never overwrite an existing node.
Calibration loop
- Pick 3-5 representative docs.
finddocs --from <subset>to confirm scope.- Run
bootstrap --from <subset>. Review proposals as they land innodes/. - Note false positives and negatives. Adjust the “what to extract” and “what to skip” sections of
src/templates-source/skills/kk-bootstrap/SKILL.md(or the per-harness copy under.claude/skills/kk-bootstrap/SKILL.mdfor a consumer-side override). - Delete
bootstrap-state.jsonand re-run. - Repeat until acceptance lands around 60-80%. Higher rates tend to drop true positives.
Reading run logs
The proposal-drain hook writes a stream-JSON trace per run. Curate and bootstrap do not: their work is part of the host harness session transcript, captured wherever the user already captures that.
Proposal: _logs/proposal/<session-id>__<ts>.jsonl
| Line type | What it is |
|---|---|
system / init | Records session id and resolved model. |
assistant | Intermediate streamed turns. |
user | Rare follow-ups. |
result | Final message. Parsed as JSON, validated against ProposalOutputSchema. |
Common failures:
| Failure | Diagnosis |
|---|---|
| No final result | claude was killed or timed out. Check timestamps. The drain writes proposal_status: failed and does not retry on its own; these modes do not heal on retry. |
| Schema mismatch | Model emitted extra prose or skipped a field. Inspect result text; tune the prompt if consistent. |
To force re-extraction of a failed entry: set proposal_status: pending in the session log and clear proposal_error. The next drain sweep picks it up.
Curate / bootstrap
These do not write _logs/curator/*.jsonl or _logs/bootstrap-incremental/*.jsonl; the work runs in the host harness session and that transcript captures the reasoning. To inspect what the curate skill did on a run, read the host session transcript (or, for the curate-dedup primitive’s output, pass --output <path> to capture the surviving-actions JSON).
| Issue | Diagnosis |
|---|---|
nodesWritten: 0 despite a non-empty batch | Check the host session transcript for skill-reported failures (slug collision, modify_missing_target, or add_collision). |
Conflict not surfacing in /kk-curate | Check .ai/kenkeep/conflicts/ for a file with status: pending. The skill walks from there. |
| Duplicates after dedup | curate-dedup keeps the higher-confidence action per proposed_node.id. Duplicates mean inconsistent slugification produced different ids. |
To re-run a single batch: clear curator_processed_at and curator_run_id from the affected session log and re-run curate.
Privacy
CAUTION
Proposal logs contain the raw transcript. kenkeep does not scan or redact for secrets, so anything in the session appears verbatim.
Treat _logs/ with the same care as _sessions/. Both are gitignored by default.