21 commits
ee34f74
Add prompt_runtime setting to rules for Claude Code headless execution
claude Jan 21, 2026
4966050
Add automated tests for prompt_runtime and include transcript path in…
claude Jan 21, 2026
6118a89
Update readme and architecture accuracy rules to use claude runtime
claude Jan 21, 2026
ab89730
Add manual test for claude runtime feature
claude Jan 21, 2026
a9e774f
Add documentation and version bump for prompt_runtime feature
nhorton Jan 21, 2026
aa84c2c
Remove unnecessary prompt_runtime from command action rules
claude Jan 21, 2026
24df915
Add fallback for claude runtime in Claude Code Web environment
claude Jan 21, 2026
2cfd2b5
Update claude runtime test instructions
claude Jan 21, 2026
10b40d3
Add comprehensive tests for invoke_claude_headless execution
claude Jan 21, 2026
7f4b96b
Add SubagentStop hook support for Claude Code
claude Jan 21, 2026
99afdca
Merge branch 'main' into claude/add-subagent-stop-hook-ho4cv
nhorton Jan 21, 2026
de0d01f
Merge branch 'claude/add-subagent-stop-hook-ho4cv' into claude/add-pr…
nhorton Jan 21, 2026
5c60460
Merge origin/main into current branch
nhorton Jan 21, 2026
f81dc14
Register the subagentstop hook too
nhorton Jan 21, 2026
84ab01c
Merge main into claude/add-prompt-runtime-setting-gPJDA
nhorton Jan 22, 2026
da17c1d
Merge branch 'main' into claude/add-prompt-runtime-setting-gPJDA
nhorton Jan 22, 2026
8a9bd42
Merge branch 'main' into claude/add-prompt-runtime-setting-gPJDA
nhorton Jan 22, 2026
a8cbbe9
Merge branch 'main' into claude/add-prompt-runtime-setting-gPJDA
nhorton Jan 22, 2026
39b66d3
Merge branch 'main' into claude/add-prompt-runtime-setting-gPJDA
nhorton Jan 22, 2026
dcea5e4
Consolidate changelog into 0.4.0 release
nhorton Jan 22, 2026
be3ff71
Add investigation notes for Claude subprocess hanging issue
nhorton Jan 23, 2026
3 changes: 3 additions & 0 deletions .deepwork/rules/architecture-documentation-accuracy.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,12 @@ name: Architecture Documentation Accuracy
trigger: src/**/*
safety: doc/architecture.md
compare_to: base
prompt_runtime: claude
---
Source code in src/ has been modified. Please review doc/architecture.md for accuracy:
1. Verify the documented architecture matches the current implementation
2. Check that file paths and directory structures are still correct
3. Ensure component descriptions reflect actual behavior
4. Update any diagrams or flows that may have changed

If the architecture documentation needs updates, make the changes directly. If the documentation is accurate, confirm it matches the current implementation.
27 changes: 27 additions & 0 deletions .deepwork/rules/manual-test-claude-runtime.md
@@ -0,0 +1,27 @@
---
name: "Manual Test: Claude Runtime"
trigger: manual_tests/test_claude_runtime/test_claude_runtime_code.py
compare_to: prompt
prompt_runtime: claude
---

# Manual Test: Claude Runtime

You are evaluating code changes as part of an automated rule check.

**Review the code in the trigger file for:**
1. Basic code quality (clear variable names, proper structure)
2. Presence of docstrings or comments
3. No obvious bugs or issues

**This is a test rule.** For testing purposes:
- If the code looks reasonable, respond with `allow`
- If there are obvious issues (syntax errors, missing functions, etc.), respond with `block`

Since this is a manual test, the code is intentionally simple and should pass review.

## This tests:

The `prompt_runtime: claude` feature. With this setting, the prompt is not
returned to the triggering agent; instead, Claude Code is invoked in headless
mode to process the rule autonomously.
1 change: 1 addition & 0 deletions .deepwork/rules/manual-test-created-mode.md
Expand Up @@ -2,6 +2,7 @@
name: "Manual Test: Created Mode"
created: manual_tests/test_created_mode/*.yml
compare_to: prompt
prompt_runtime: send_to_stopping_agent
---

# Manual Test: Created Mode (File Creation Trigger)
1 change: 1 addition & 0 deletions .deepwork/rules/manual-test-infinite-block-prompt.md
Expand Up @@ -2,6 +2,7 @@
name: "Manual Test: Infinite Block Prompt"
trigger: manual_tests/test_infinite_block_prompt/test_infinite_block_prompt.py
compare_to: prompt
prompt_runtime: send_to_stopping_agent
---

# Manual Test: Infinite Block Prompt (Promise Required)
1 change: 1 addition & 0 deletions .deepwork/rules/manual-test-multi-safety.md
Expand Up @@ -5,6 +5,7 @@ safety:
- manual_tests/test_multi_safety/test_multi_safety_changelog.md
- manual_tests/test_multi_safety/test_multi_safety_version.txt
compare_to: prompt
prompt_runtime: send_to_stopping_agent
---

# Manual Test: Multiple Safety Patterns
1 change: 1 addition & 0 deletions .deepwork/rules/manual-test-pair-mode.md
Expand Up @@ -4,6 +4,7 @@ pair:
trigger: manual_tests/test_pair_mode/test_pair_mode_trigger.py
expects: manual_tests/test_pair_mode/test_pair_mode_expected.md
compare_to: prompt
prompt_runtime: send_to_stopping_agent
---

# Manual Test: Pair Mode (Directional Correspondence)
1 change: 1 addition & 0 deletions .deepwork/rules/manual-test-set-mode.md
Expand Up @@ -4,6 +4,7 @@ set:
- manual_tests/test_set_mode/test_set_mode_source.py
- manual_tests/test_set_mode/test_set_mode_test.py
compare_to: prompt
prompt_runtime: send_to_stopping_agent
---

# Manual Test: Set Mode (Bidirectional Correspondence)
1 change: 1 addition & 0 deletions .deepwork/rules/manual-test-trigger-safety.md
Expand Up @@ -3,6 +3,7 @@ name: "Manual Test: Trigger Safety"
trigger: manual_tests/test_trigger_safety_mode/test_trigger_safety_mode.py
safety: manual_tests/test_trigger_safety_mode/test_trigger_safety_mode_doc.md
compare_to: prompt
prompt_runtime: send_to_stopping_agent
---

# Manual Test: Trigger/Safety Mode
3 changes: 3 additions & 0 deletions .deepwork/rules/readme-accuracy.md
Expand Up @@ -3,9 +3,12 @@ name: README Accuracy
trigger: src/**/*
safety: README.md
compare_to: base
prompt_runtime: claude
---
Source code in src/ has been modified. Please review README.md for accuracy:
1. Verify project overview still reflects current functionality
2. Check that usage examples are still correct
3. Ensure installation/setup instructions remain valid
4. Update any sections that reference changed code

If the README needs updates, make the changes directly. If the README is accurate, confirm it matches the current implementation.
1 change: 1 addition & 0 deletions .deepwork/rules/skill-template-best-practices.md
Expand Up @@ -2,6 +2,7 @@
name: Skill Template Best Practices
trigger: src/deepwork/templates/**/skill-job*.jinja
compare_to: prompt
prompt_runtime: send_to_stopping_agent
---
Skill template files are being modified. Ensure the generated skills follow these best practices:

1 change: 1 addition & 0 deletions .deepwork/rules/standard-jobs-source-of-truth.md
Expand Up @@ -7,6 +7,7 @@ safety:
- src/deepwork/standard_jobs/deepwork_jobs/**/*
- src/deepwork/standard_jobs/deepwork_rules/**/*
compare_to: base
prompt_runtime: send_to_stopping_agent
---
You modified files in `.deepwork/jobs/deepwork_jobs/` or `.deepwork/jobs/deepwork_rules/`.

1 change: 1 addition & 0 deletions .deepwork/rules/version-and-changelog-update.md
Expand Up @@ -5,6 +5,7 @@ safety:
- pyproject.toml
- CHANGELOG.md
compare_to: base
prompt_runtime: send_to_stopping_agent
---
Source code in src/ has been modified. **You MUST evaluate whether version and changelog updates are needed.**

92 changes: 38 additions & 54 deletions CHANGELOG.md
Expand Up @@ -5,73 +5,58 @@ All notable changes to DeepWork will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.5.2] - 2026-01-22

### Fixed
- Fixed COMMAND rules promise handling to properly update queue status
- When an agent provides a promise tag for a FAILED command rule, the queue entry is now correctly updated to SKIPPED status
- Previously, FAILED queue entries remained in FAILED state even after being acknowledged via promise
- This ensures the rules queue accurately reflects rule state throughout the workflow

## [0.5.1] - 2026-01-22

### Fixed
- Fixed quality criteria validation logic in skill template (#111)
- Changed promise condition from AND to OR: promise OR all criteria met now passes
- Changed failure condition from OR to AND: requires both criteria NOT met AND promise missing to fail
- This corrects the logic so the promise mechanism properly serves as a bypass for quality criteria

## [0.5.0] - 2026-01-20

### Changed
- **BREAKING**: Renamed `document_type` to `doc_spec` throughout the codebase
- Job.yml field: `document_type` → `doc_spec` (e.g., `outputs: [{file: "report.md", doc_spec: ".deepwork/doc_specs/report.md"}]`)
- Class: `DocumentTypeDefinition` → `DocSpec` (backward compat alias provided)
- Methods: `has_document_type()` → `has_doc_spec()`, `validate_document_type_references()` → `validate_doc_spec_references()`
- Template variables: `has_document_type` → `has_doc_spec`, `document_type` → `doc_spec`
- Internal: `_load_document_type()` → `_load_doc_spec()`, `_doc_type_cache` → `_doc_spec_cache`
## [0.4.0] - 2026-01-22

### Added
- Comprehensive tests for generator doc spec integration (9 new tests)
- `test_load_doc_spec_returns_parsed_spec` - Verifies doc spec loading
- `test_load_doc_spec_caches_result` - Verifies caching behavior
- `test_load_doc_spec_returns_none_for_missing_file` - Graceful handling of missing files
- `test_generate_step_skill_with_doc_spec` - End-to-end skill generation with doc spec
- `test_build_step_context_includes_doc_spec_info` - Context building verification

### Migration Guide
- Update job.yml files: Change `document_type:` to `doc_spec:` in output definitions
- Update any code importing `DocumentTypeDefinition`: Use `DocSpec` instead (alias still works)
- Run `deepwork install` to regenerate skills with updated terminology

## [0.4.0] - 2026-01-20

### Added
- Doc specs (document specifications) as a first-class feature for formalizing document quality criteria
- **Doc specs** (document specifications) as a first-class feature for formalizing document quality criteria
- New `src/deepwork/schemas/doc_spec_schema.py` with JSON schema validation
- New `src/deepwork/core/doc_spec_parser.py` with parser for frontmatter markdown doc spec files
- Doc spec files stored in `.deepwork/doc_specs/` directory with quality criteria and example documents
- Auto-creates `.deepwork/doc_specs/` directory during `deepwork install`
- Extended job.yml output schema to support doc spec references
- Outputs can now be strings (backward compatible) or objects with `file` and optional `doc_spec` fields
- Example: `outputs: [{file: "report.md", doc_spec: ".deepwork/doc_specs/monthly_report.md"}]`
- The `doc_spec` uses the full path to the doc spec file, making references self-documenting
- Doc spec-aware skill generation
- Step skills now include doc spec quality criteria, target audience, and example documents
- Both Claude and Gemini templates updated for doc spec rendering
- Document detection workflow in `deepwork_jobs.define`
- Steps 1.5, 1.6, 1.7 guide users through creating doc specs for document-oriented jobs
- Pattern indicators: "report", "summary", "create", "monthly", "for stakeholders"
- Doc spec improvement workflow in `deepwork_jobs.learn`
- Steps 3.5, 4.5 capture doc spec-related learnings and update doc spec files
- New `OutputSpec` dataclass in parser for structured output handling
- Comprehensive doc spec documentation in `doc/doc-specs.md`
- New test fixtures for doc spec validation and parsing
- Doc spec-aware skill generation with quality criteria, target audience, and example documents
- **`prompt_runtime` setting** for rules to control how prompt-type actions are executed
- `send_to_stopping_agent` (default): Returns prompt to the agent that triggered the rule
- `claude`: Invokes Claude Code in headless mode to handle the rule independently
- Claude headless mode execution for automated rule remediation
- Rules with `prompt_runtime: claude` spawn a separate Claude process
- Claude performs required actions and returns structured `block`/`allow` decision
- Useful for automated tasks like documentation updates without blocking the main agent
- **`deepwork rules clear_queue` CLI command** for managing the rules queue (#117)
- Clears all entries from the rules queue to reset state
- Code review stage added to the `commit` standard job (#99)
- New `commit.review` step runs before testing to catch issues early
- Session start hook for version checking (#106)
- Manual tests job for validating hook/rule behavior (#102)

### Changed
- **BREAKING**: Renamed `document_type` to `doc_spec` throughout the codebase
- Job.yml field: `document_type` → `doc_spec`
- Class: `DocumentTypeDefinition` → `DocSpec` (backward compat alias provided)
- Methods: `has_document_type()` → `has_doc_spec()`, `validate_document_type_references()` → `validate_doc_spec_references()`
- `Step.outputs` changed from `list[str]` to `list[OutputSpec]` for richer output metadata
- `SkillGenerator.generate_all_skills()` now accepts `project_root` parameter for doc spec loading
- Updated `deepwork_jobs` to v0.6.0 with doc spec-related quality criteria
- Skill template documentation now uses generic "agent" terminology (#115)

### Fixed
- Fixed infinite loop bug in rules system when promise tags weren't recognized (#96)
- Rules now properly detect and honor promise acknowledgments
- Fixed COMMAND rules promise handling to properly update queue status (#120)
- FAILED queue entries now correctly update to SKIPPED when acknowledged via promise
- Fixed quality criteria validation logic in skill template (#113)
- Promise OR all criteria met now passes (was incorrectly AND)
- Requires both criteria NOT met AND promise missing to fail
- Fixed `compare_to: prompt` mode not detecting committed files during agent response (#95)
- Rules now search prompts for directory references
- Added timeout to deepwork install hook (#101)
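
The quality-criteria fix listed above is a De Morgan correction: the pass condition is an OR, so the fail condition must be the AND of the two negations. A minimal sketch of that logic follows; the function names are hypothetical and are not taken from the skill template.

```python
def quality_gate_passes(promise_present: bool, all_criteria_met: bool) -> bool:
    """Pass when the promise is present OR every quality criterion is met."""
    return promise_present or all_criteria_met


def quality_gate_fails(promise_present: bool, all_criteria_met: bool) -> bool:
    """Fail only when criteria are NOT met AND the promise is missing."""
    return (not all_criteria_met) and (not promise_present)
```

Under this reading, a promise alone is enough to pass, which is what lets it act as a bypass for the quality criteria.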

### Migration Guide
- Update job.yml files: Change `document_type:` to `doc_spec:` in output definitions
- Update any code importing `DocumentTypeDefinition`: Use `DocSpec` instead (alias still works)
- Run `deepwork install` to regenerate skills with updated terminology

## [0.3.1] - 2026-01-20

Expand Up @@ -180,7 +165,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

Initial version.

[0.5.0]: https://github.com/anthropics/deepwork/releases/tag/0.5.0
[0.4.0]: https://github.com/anthropics/deepwork/releases/tag/0.4.0
[0.3.1]: https://github.com/anthropics/deepwork/releases/tag/0.3.1
[0.3.0]: https://github.com/anthropics/deepwork/releases/tag/0.3.0
15 changes: 15 additions & 0 deletions README.md
Expand Up @@ -313,6 +313,21 @@ compare_to: prompt
---
```

**Example Rule with Claude Runtime** (`.deepwork/rules/readme-accuracy.md`):
```markdown
---
name: README Accuracy
trigger: "src/**/*.py"
compare_to: prompt
prompt_runtime: claude
---
Source code has been modified. Review README.md for accuracy and update if needed.
```

The `prompt_runtime` setting controls how prompt-based rules are executed:
- `send_to_stopping_agent` (default): Returns the rule prompt to the agent that triggered it
- `claude`: Invokes Claude Code in headless mode to evaluate the rule independently

### Multi-Platform Support
Generate native commands and skills tailored for your AI coding assistant.
- **Native Integration**: Works directly with the skill/command formats of supported agents.
70 changes: 61 additions & 9 deletions doc/architecture.md
Expand Up @@ -1043,7 +1043,7 @@ Please create or update tests for the modified source files.

### Detection Modes

Rules support three detection modes:
Rules support four detection modes:

**1. Trigger/Safety (default)** - Fire when trigger matches but safety doesn't:
```yaml
Expand Up @@ -1078,6 +1078,16 @@ compare_to: base
---
```

**4. Created** - Fire when newly created files match patterns:
```yaml
---
name: New Component Checklist
created: "src/components/**/*.tsx"
compare_to: base
---
```
This mode fires only for newly created files, not for modified ones, which makes it useful for enforcing standards on new files.
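
A minimal sketch of created-mode filtering, assuming changes are available as `(status, path)` pairs in the style of `git diff --name-status`, where `A` marks an added file and `M` a modified one. The helper name `created_matches` is hypothetical, and `fnmatch` is only a stand-in for the project's real glob matcher: it lets `*` cross `/`, and its `**/` does not match zero directories.

```python
from fnmatch import fnmatch


def created_matches(changes, pattern):
    """Return paths that were newly added ("A") and match the glob pattern."""
    return [
        path
        for status, path in changes
        if status == "A" and fnmatch(path, pattern)
    ]


changes = [
    ("A", "src/components/forms/Input.tsx"),   # new file, matches the pattern
    ("M", "src/components/forms/Button.tsx"),  # modified only, so ignored
    ("A", "src/utils/date.ts"),                # new, but outside the pattern
]
fired_for = created_matches(changes, "src/components/**/*.tsx")
```

Only the first entry would cause the rule to fire; the modified file is skipped even though its path matches.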

### Action Types

**1. Prompt (default)** - Show instructions to the agent:
Expand All @@ -1102,6 +1112,42 @@ compare_to: prompt
---
```

### Prompt Runtime

For prompt-type actions, you can specify how the prompt is delivered using the `prompt_runtime` setting:

**1. send_to_stopping_agent (default)** - Return the prompt to the agent that triggered the rule:
```yaml
---
name: Security Review
trigger: "src/auth/**/*"
compare_to: base
prompt_runtime: send_to_stopping_agent
---
Please check for hardcoded credentials and validate input.
```

**2. claude** - Invoke Claude Code in headless mode to handle the rule:
```yaml
---
name: Architecture Documentation Accuracy
trigger: "src/deepwork/core/**/*.py"
safety: "doc/architecture.md"
compare_to: base
prompt_runtime: claude
---
Review doc/architecture.md for accuracy against the current implementation.
```

When `prompt_runtime: claude` is set, rule evaluation proceeds as follows:
1. A separate Claude Code process is spawned in headless mode
2. The rule instructions are passed to it as a prompt
3. Claude performs the required actions (e.g., updating documentation)
4. Claude returns a structured `block` or `allow` decision
5. If the decision is `allow`, the rule is marked as passed without blocking the original agent

This is useful for automated remediation tasks that don't require user interaction.
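
The flow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: `Rule`, `RuleResult`, `parse_decision`, and `evaluate_prompt_rule` are hypothetical names, the headless call is injected as a plain callable so no real CLI is invoked, and treating unparseable output as `block` is an assumed fail-safe default.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    instructions: str
    prompt_runtime: str = "send_to_stopping_agent"


@dataclass
class RuleResult:
    decision: str  # "allow" or "block"
    message: str


def parse_decision(output: str) -> str:
    """Find the last line that is exactly `allow` or `block` (assumed format)."""
    for line in reversed(output.strip().splitlines()):
        token = line.strip().strip("`").lower()
        if token in ("allow", "block"):
            return token
    return "block"  # assumption: fail safe when no decision is found


def evaluate_prompt_rule(rule: Rule, run_headless: Callable[[str], str]) -> RuleResult:
    """Dispatch a prompt-type rule according to its prompt_runtime."""
    if rule.prompt_runtime == "claude":
        output = run_headless(rule.instructions)  # steps 1-3: separate process
        return RuleResult(parse_decision(output), output)  # steps 4-5
    # Default runtime: hand the instructions back to the triggering agent.
    return RuleResult("block", rule.instructions)


# Usage with a fake runner standing in for the headless process.
docs_rule = Rule("Architecture Documentation Accuracy", "Review doc/architecture.md", "claude")
result = evaluate_prompt_rule(docs_rule, lambda prompt: "Updated the diagram.\nallow")
```

Injecting the runner keeps the dispatch logic testable without spawning a subprocess.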

### Rule Evaluation Flow

1. **Session Start**: When a Claude Code session begins, the baseline git state is captured
Expand Up @@ -1289,15 +1335,21 @@ See `doc/doc-specs.md` for complete documentation.

### Rule Schema

Rules are validated against a JSON Schema:
Rules are validated against a JSON Schema. The frontmatter supports these fields:

```yaml
- name: string # Required: Friendly name for the rule
trigger: string|array # Required: Glob pattern(s) for triggering files
safety: string|array # Optional: Glob pattern(s) for safety files
instructions: string # Required (unless instructions_file): What to do
instructions_file: string # Alternative: Path to instructions file
```
| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Human-friendly name for the rule (displayed in promise tags) |
| `compare_to` | Yes | Baseline for detecting file changes: `base`, `default_tip`, or `prompt` |
| `trigger` | One mode required | Glob pattern(s) for triggering files (trigger/safety mode) |
| `safety` | No | Glob pattern(s) that suppress the rule if changed |
| `set` | One mode required | Array of patterns for bidirectional correspondence |
| `pair` | One mode required | Object with `trigger` and `expects` for directional correspondence |
| `created` | One mode required | Glob pattern(s) for newly created files |
| `action` | No | Object with `command` and optional `run_for` for command actions |
| `prompt_runtime` | No | `send_to_stopping_agent` (default) or `claude` for headless execution |

The markdown body after the frontmatter contains the instructions for prompt-type rules.
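
As a rough illustration of the field table, the checks below validate an already-parsed frontmatter dictionary. This is a sketch, not the project's JSON Schema: `validate_frontmatter` is a hypothetical helper, and reading "One mode required" as "exactly one of `trigger`, `set`, `pair`, `created`" is an assumption.

```python
from __future__ import annotations

DETECTION_FIELDS = ("trigger", "set", "pair", "created")
COMPARE_TO_VALUES = {"base", "default_tip", "prompt"}
PROMPT_RUNTIMES = {"send_to_stopping_agent", "claude"}


def validate_frontmatter(frontmatter: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: list[str] = []
    if not frontmatter.get("name"):
        errors.append("name is required")
    if frontmatter.get("compare_to") not in COMPARE_TO_VALUES:
        errors.append("compare_to must be one of: base, default_tip, prompt")
    # Assumption: exactly one detection mode may be declared per rule.
    modes = [field for field in DETECTION_FIELDS if field in frontmatter]
    if len(modes) != 1:
        errors.append(f"exactly one detection mode expected, found {modes}")
    # prompt_runtime is optional and defaults to send_to_stopping_agent.
    runtime = frontmatter.get("prompt_runtime", "send_to_stopping_agent")
    if runtime not in PROMPT_RUNTIMES:
        errors.append(f"unknown prompt_runtime: {runtime!r}")
    return errors


readme_rule = {
    "name": "README Accuracy",
    "trigger": "src/**/*",
    "safety": "README.md",
    "compare_to": "base",
    "prompt_runtime": "claude",
}
problems = validate_frontmatter(readme_rule)
```

For the `readme-accuracy.md` frontmatter shown in this change, `problems` is empty.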

### Defining Rules
