diff --git a/.claude/skills/add_platform.add_capabilities/SKILL.md b/.claude/skills/add_platform.add_capabilities/SKILL.md deleted file mode 100644 index 17359135..00000000 --- a/.claude/skills/add_platform.add_capabilities/SKILL.md +++ /dev/null @@ -1,216 +0,0 @@ ---- -name: add_platform.add_capabilities -description: "Updates job schema and adapters with any new hook events the platform supports. Use after research to extend DeepWork's hook system." -user-invocable: false - ---- - -# add_platform.add_capabilities - -**Step 2/4** in **integrate** workflow - -> Full workflow to integrate a new AI platform into DeepWork - -> Adds a new AI platform to DeepWork with adapter, templates, and tests. Use when integrating Cursor, Windsurf, or other AI coding tools. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/add_platform.research` - -## Instructions - -**Goal**: Updates job schema and adapters with any new hook events the platform supports. Use after research to extend DeepWork's hook system. - -# Add Hook Capabilities - -## Objective - -Update the DeepWork job schema and platform adapters to support any new hook events that the new platform provides for slash command definitions. - -## Task - -Analyze the hooks documentation from the research step and update the codebase to support any new hook capabilities, ensuring consistency across all existing adapters. - -### Prerequisites - -Read the hooks documentation created in the previous step: -- `doc/platforms//hooks_system.md` - -Also review the existing schema and adapters: -- `src/deepwork/schemas/job_schema.py` -- `src/deepwork/adapters.py` - -### Process - -1. **Analyze the new platform's hooks** - - Read `doc/platforms//hooks_system.md` - - List all hooks available for slash command definitions - - Compare with hooks already in `job_schema.py` - - Identify any NEW hooks not currently supported - -2. **Determine if schema changes are needed** - - If the platform has hooks that DeepWork doesn't currently support, add them - - If all hooks are already supported, document this finding - - Remember: Only add hooks that are available on slash command definitions - -3. **Update job_schema.py (if needed)** - - Add new hook fields to the step schema - - Follow existing patterns for hook definitions - - Add appropriate type hints and documentation - - Example addition: - ```python - # New hook from - new_hook_name: Optional[List[HookConfig]] = None - ``` - -4. **Update all existing adapters** - - Open `src/deepwork/adapters.py` - - For EACH existing adapter class: - - Add the new hook field (set to `None` if not supported) - - This maintains consistency across all adapters - - Document why each adapter does or doesn't support the hook - -5. **Validate the changes** - - Run Python syntax check: `python -m py_compile src/deepwork/schemas/job_schema.py` - - Run Python syntax check: `python -m py_compile src/deepwork/adapters.py` - - Ensure no import errors - -6. **Document the decision** - - If no new hooks were added, add a comment explaining why - - If new hooks were added, ensure they're documented in the schema - -## Output Format - -### job_schema.py - -Location: `src/deepwork/schemas/job_schema.py` - -If new hooks are added: -```python -@dataclass -class StepDefinition: - # ... existing fields ... 
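    # (Illustration only: a hypothetical example of the existing hook-field
    # pattern. The real field list lives in src/deepwork/schemas/job_schema.py.)
    # stop_hooks: Optional[List[HookConfig]] = None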
- - # New hook from - [description of what it does] - new_hook_name: Optional[List[HookConfig]] = None -``` - -### adapters.py - -Location: `src/deepwork/adapters.py` - -For each existing adapter, add the new hook field: -```python -class ExistingPlatformAdapter(PlatformAdapter): - # ... existing code ... - - def get_hook_support(self) -> dict: - return { - # ... existing hooks ... - "new_hook_name": None, # Not supported by this platform - } -``` - -Or if no changes are needed, add a documentation comment: -```python -# NOTE: hooks reviewed on YYYY-MM-DD -# No new hooks to add - all command hooks are already -# supported by the existing schema (stop_hooks covers their validation pattern) -``` - -## Quality Criteria - -- Hooks documentation from research step has been reviewed -- If new hooks exist: - - Added to `src/deepwork/schemas/job_schema.py` with proper typing - - ALL existing adapters updated in `src/deepwork/adapters.py` - - Each adapter indicates support level (implemented, None, or partial) -- If no new hooks needed: - - Decision documented with a comment explaining the analysis -- Only hooks available on slash command definitions are considered -- `job_schema.py` has no syntax errors (verified with py_compile) -- `adapters.py` has no syntax errors (verified with py_compile) -- All adapters have consistent hook fields (same fields across all adapters) -- When all criteria are met, include `✓ Quality Criteria Met` in your response - -## Context - -DeepWork supports multiple AI platforms, and each platform may have different capabilities for hooks within command definitions. The schema defines what hooks CAN exist, while adapters define what each platform actually SUPPORTS. - -This separation allows: -- Job definitions to use any hook (the schema is the superset) -- Platform-specific generation to only use supported hooks (adapters filter) -- Future platforms to add new hooks without breaking existing ones - -Maintaining consistency is critical - all adapters must have the same hook fields, even if they don't support them (use `None` for unsupported). - -## Common Hook Types - -For reference, here are common hook patterns across platforms: - -| Hook Type | Purpose | Example Platforms | -|-----------|---------|-------------------| -| `stop_hooks` | Quality validation loops | Claude Code | -| `pre_hooks` | Run before command | Various | -| `post_hooks` | Run after command | Various | -| `validation_hooks` | Validate inputs/outputs | Various | - -When you find a new hook type, consider whether it maps to an existing pattern or is genuinely new functionality. - - -### Job Context - -A workflow for adding support for a new AI platform (like Cursor, Windsurf, etc.) to DeepWork. - -The **integrate** workflow guides you through four phases: -1. **Research**: Capture the platform's CLI configuration and hooks system documentation -2. **Add Capabilities**: Update the job schema and adapters with any new hook events -3. **Implement**: Create the platform adapter, templates, tests (100% coverage), and README updates -4. **Verify**: Ensure installation works correctly and produces expected files - -The workflow ensures consistency across all supported platforms and maintains -comprehensive test coverage for new functionality. 
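A short sketch may help make the schema/adapter split concrete. The snippet below is illustrative only: `get_hook_support` mirrors the adapter method shown above, but the `filter_hooks` helper and its exact signature are assumptions for this example, not code from the DeepWork repository.

```python
# Hypothetical sketch of the "adapters filter" idea: job definitions may
# declare any hook from the schema superset, and generation keeps only the
# hooks the target platform's adapter reports as supported.
from typing import Any


def filter_hooks(step_hooks: dict[str, Any], adapter) -> dict[str, Any]:
    """Keep only the hooks this platform's adapter actually supports."""
    support = adapter.get_hook_support()  # e.g. {"stop_hooks": True, "pre_hooks": None}
    return {
        name: config
        for name, config in step_hooks.items()
        if support.get(name)  # None/False means the platform ignores this hook
    }
```

Because unsupported hooks are dropped rather than rejected, a job definition can stay platform-agnostic while each adapter emits only what its platform understands.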
- -**Important Notes**: -- Only hooks available on slash command definitions should be captured -- Each existing adapter must be updated when new hooks are added (typically with null values) -- Tests must achieve 100% coverage for any new functionality -- Installation verification confirms the platform integrates correctly with existing jobs - - -## Required Inputs - - -**Files from Previous Steps** - Read these first: -- `hooks_system.md` (from `research`) - -## Work Branch - -Use branch format: `deepwork/add_platform-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/add_platform-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `job_schema.py` -- `adapters.py` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## On Completion - -1. Verify outputs are created -2. Inform user: "integrate step 2/4 complete, outputs: job_schema.py, adapters.py" -3. **Continue workflow**: Use Skill tool to invoke `/add_platform.implement` - ---- - -**Reference files**: `.deepwork/jobs/add_platform/job.yml`, `.deepwork/jobs/add_platform/steps/add_capabilities.md` \ No newline at end of file diff --git a/.claude/skills/add_platform.implement/SKILL.md b/.claude/skills/add_platform.implement/SKILL.md deleted file mode 100644 index 5d925402..00000000 --- a/.claude/skills/add_platform.implement/SKILL.md +++ /dev/null @@ -1,321 +0,0 @@ ---- -name: add_platform.implement -description: "Creates platform adapter, templates, tests with 100% coverage, and README documentation. Use after adding hook capabilities." -user-invocable: false -hooks: - Stop: - - hooks: - - type: command - command: ".deepwork/jobs/add_platform/hooks/run_tests.sh" - SubagentStop: - - hooks: - - type: command - command: ".deepwork/jobs/add_platform/hooks/run_tests.sh" - ---- - -# add_platform.implement - -**Step 3/4** in **integrate** workflow - -> Full workflow to integrate a new AI platform into DeepWork - -> Adds a new AI platform to DeepWork with adapter, templates, and tests. Use when integrating Cursor, Windsurf, or other AI coding tools. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/add_platform.research` -- `/add_platform.add_capabilities` - -## Instructions - -**Goal**: Creates platform adapter, templates, tests with 100% coverage, and README documentation. Use after adding hook capabilities. - -# Implement Platform Support - -## Objective - -Create the complete platform implementation including the adapter class, command templates, comprehensive tests, and documentation updates. - -## Task - -Build the full platform support by implementing the adapter, creating templates, writing tests with 100% coverage, and updating the README. - -### Prerequisites - -Read the outputs from previous steps: -- `doc/platforms//cli_configuration.md` - For template structure -- `src/deepwork/schemas/job_schema.py` - For current schema -- `src/deepwork/adapters.py` - For adapter patterns - -Also review existing implementations for reference: -- `src/deepwork/templates/claude/` - Example templates -- `tests/` - Existing test patterns - -### Process - -1. 
**Create the platform adapter class** - - Add a new adapter class to `src/deepwork/adapters.py`: - - ```python - class NewPlatformAdapter(PlatformAdapter): - """Adapter for .""" - - platform_name = "" - command_directory = "" # e.g., ".cursor/commands" - command_extension = ".md" # or appropriate extension - - def get_hook_support(self) -> dict: - """Return which hooks this platform supports.""" - return { - "stop_hooks": True, # or False/None - # ... other hooks - } - - def generate_command(self, step: StepDefinition, job: JobDefinition) -> str: - """Generate command file content for this platform.""" - # Use Jinja2 template - template = self.env.get_template(f"{self.platform_name}/command.md.j2") - return template.render(step=step, job=job) - ``` - -2. **Create command templates** - - Create templates in `src/deepwork/templates//`: - - - `command.md.j2` - Main command template - - Any other templates needed for the platform's format - - Use the CLI configuration documentation to ensure the template matches the platform's expected format. - -3. **Register the adapter** - - Update the adapter registry in `src/deepwork/adapters.py`: - - ```python - PLATFORM_ADAPTERS = { - "claude": ClaudeAdapter, - "": NewPlatformAdapter, - # ... other adapters - } - ``` - -4. **Write comprehensive tests** - - Create tests in `tests/` that cover: - - - Adapter instantiation - - Hook support detection - - Command generation - - Template rendering - - Edge cases (empty inputs, special characters, etc.) - - Integration with the sync command - - **Critical**: Tests must achieve 100% coverage of new code. - -5. **Update README.md** - - Add the new platform to `README.md`: - - - Add to "Supported Platforms" list - - Add installation instructions: - ```bash - deepwork install --platform - ``` - - Document any platform-specific notes or limitations - -6. **Run tests and verify coverage** - - ```bash - uv run pytest --cov=src/deepwork --cov-report=term-missing - ``` - - - All tests must pass - - New code must have 100% coverage - - If coverage is below 100%, add more tests - -7. **Iterate until tests pass with full coverage** - - This step has a `stop_hooks` script that runs tests. Keep iterating until: - - All tests pass - - Coverage is 100% for new functionality - -## Output Format - -### templates/ - -Location: `src/deepwork/templates//` - -Create the following files: - -**command.md.j2**: -```jinja2 -{# Template for command files #} -{# Follows the platform's expected format from cli_configuration.md #} - -[Platform-specific frontmatter or metadata] - -# {{ step.name }} - -{{ step.description }} - -## Instructions - -{{ step.instructions_content }} - -[... rest of template based on platform format ...] -``` - -### tests/ - -Location: `tests/test__adapter.py` - -```python -"""Tests for the adapter.""" -import pytest -from deepwork.adapters import NewPlatformAdapter - -class TestNewPlatformAdapter: - """Test suite for NewPlatformAdapter.""" - - def test_adapter_initialization(self): - """Test adapter can be instantiated.""" - adapter = NewPlatformAdapter() - assert adapter.platform_name == "" - - def test_hook_support(self): - """Test hook support detection.""" - adapter = NewPlatformAdapter() - hooks = adapter.get_hook_support() - assert "stop_hooks" in hooks - # ... more assertions - - def test_command_generation(self): - """Test command file generation.""" - # ... test implementation - - # ... 
more tests for 100% coverage -``` - -### README.md - -Add to the existing README.md: - -```markdown -## Supported Platforms - -- **Claude Code** - Anthropic's CLI for Claude -- **** - [Brief description] - -## Installation - -### - -```bash -deepwork install --platform -``` - -[Any platform-specific notes] -``` - -## Quality Criteria - -- Platform adapter class added to `src/deepwork/adapters.py`: - - Inherits from `PlatformAdapter` - - Implements all required methods - - Registered in `PLATFORM_ADAPTERS` -- Templates created in `src/deepwork/templates//`: - - `command.md.j2` exists and renders correctly - - Format matches platform's expected command format -- Tests created in `tests/`: - - Cover all new adapter functionality - - Cover template rendering - - All tests pass -- Test coverage is 100% for new code: - - Run `uv run pytest --cov=src/deepwork --cov-report=term-missing` - - No uncovered lines in new code -- README.md updated: - - Platform listed in supported platforms - - Installation command documented - - Any platform-specific notes included -- When all criteria are met, include `✓ Quality Criteria Met` in your response - -## Context - -This is the core implementation step. The adapter you create will be responsible for: -- Determining where command files are placed -- Generating command file content from job definitions -- Handling platform-specific features and hooks - -The templates use Jinja2 and should produce files that match exactly what the platform expects. Reference the CLI configuration documentation frequently to ensure compatibility. - -## Tips - -- Study the existing `ClaudeAdapter` as a reference implementation -- Run tests frequently as you implement -- Use `--cov-report=html` for a detailed coverage report -- If a test is hard to write, the code might need refactoring -- Template syntax errors often show up at runtime - test early - - -### Job Context - -A workflow for adding support for a new AI platform (like Cursor, Windsurf, etc.) to DeepWork. - -The **integrate** workflow guides you through four phases: -1. **Research**: Capture the platform's CLI configuration and hooks system documentation -2. **Add Capabilities**: Update the job schema and adapters with any new hook events -3. **Implement**: Create the platform adapter, templates, tests (100% coverage), and README updates -4. **Verify**: Ensure installation works correctly and produces expected files - -The workflow ensures consistency across all supported platforms and maintains -comprehensive test coverage for new functionality. 
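Since template syntax errors surface only at render time, one cheap safeguard is a test that renders the new template once with stand-in data. The sketch below is an assumption-laden example, not a repo test: the template path (`cursor/command.md.j2`) and the `SimpleNamespace` stand-ins are placeholders for the real `StepDefinition`/`JobDefinition` objects.

```python
# Hedged sketch: render the new platform's template once so Jinja2 syntax
# errors fail fast in CI instead of at install time.
from types import SimpleNamespace

from jinja2 import Environment, FileSystemLoader


def test_command_template_renders():
    env = Environment(loader=FileSystemLoader("src/deepwork/templates"))
    template = env.get_template("cursor/command.md.j2")  # hypothetical platform dir
    step = SimpleNamespace(name="demo_step", description="d", instructions_content="i")
    job = SimpleNamespace(name="add_platform")
    rendered = template.render(step=step, job=job)
    assert "demo_step" in rendered  # assumes the template prints {{ step.name }}
```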
- -**Important Notes**: -- Only hooks available on slash command definitions should be captured -- Each existing adapter must be updated when new hooks are added (typically with null values) -- Tests must achieve 100% coverage for any new functionality -- Installation verification confirms the platform integrates correctly with existing jobs - - -## Required Inputs - - -**Files from Previous Steps** - Read these first: -- `job_schema.py` (from `add_capabilities`) -- `adapters.py` (from `add_capabilities`) -- `cli_configuration.md` (from `research`) - -## Work Branch - -Use branch format: `deepwork/add_platform-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/add_platform-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `templates/` (directory) -- `tests/` (directory) -- `README.md` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -**Validation script**: `.deepwork/jobs/add_platform/hooks/run_tests.sh` (runs automatically) -## On Completion - -1. Verify outputs are created -2. Inform user: "integrate step 3/4 complete, outputs: templates/, tests/, README.md" -3. **Continue workflow**: Use Skill tool to invoke `/add_platform.verify` - ---- - -**Reference files**: `.deepwork/jobs/add_platform/job.yml`, `.deepwork/jobs/add_platform/steps/implement.md` \ No newline at end of file diff --git a/.claude/skills/add_platform.research/SKILL.md b/.claude/skills/add_platform.research/SKILL.md deleted file mode 100644 index d113d3e9..00000000 --- a/.claude/skills/add_platform.research/SKILL.md +++ /dev/null @@ -1,265 +0,0 @@ ---- -name: add_platform.research -description: "Captures CLI configuration and hooks system documentation for the new platform. Use when starting platform integration." -user-invocable: false - ---- - -# add_platform.research - -**Step 1/4** in **integrate** workflow - -> Full workflow to integrate a new AI platform into DeepWork - -> Adds a new AI platform to DeepWork with adapter, templates, and tests. Use when integrating Cursor, Windsurf, or other AI coding tools. - - -## Instructions - -**Goal**: Captures CLI configuration and hooks system documentation for the new platform. Use when starting platform integration. - -# Research Platform Documentation - -## Objective - -Capture comprehensive documentation for the new AI platform's CLI configuration and hooks system, creating a local reference that will guide the implementation phases. - -## Task - -Research the target platform's official documentation and create two focused documentation files that will serve as the foundation for implementing platform support in DeepWork. - -### Process - -1. **Identify the platform's documentation sources** - - Find the official documentation website - - Locate the CLI/agent configuration documentation - - Find the hooks or customization system documentation - - Note: Focus ONLY on slash command/custom command hooks, not general CLI hooks - -2. **Gather CLI configuration documentation** - - How is the CLI configured? (config files, environment variables, etc.) - - Where are custom commands/skills stored? - - What is the command file format? (markdown, YAML, etc.) - - What metadata or frontmatter is supported? 
- - How does the platform discover and load commands? - -3. **Gather hooks system documentation** - - What hooks are available for custom command definitions? - - Focus on hooks that trigger during or after command execution - - Examples: `stop_hooks`, `pre_hooks`, `post_hooks`, validation hooks - - Document the syntax and available hook types - - **Important**: Only document hooks available on slash command definitions, not general CLI hooks - -4. **Create the documentation files** - - Place files in `doc/platforms//` - - Each file must have a header comment with source and date - - Content should be comprehensive but focused - -## Output Format - -### cli_configuration.md - -Located at: `doc/platforms//cli_configuration.md` - -**Structure**: -```markdown - - -# CLI Configuration - -## Overview - -[Brief description of the platform and its CLI/agent system] - -## Configuration Files - -[Document where configuration lives and its format] - -### File Locations - -- [Location 1]: [Purpose] -- [Location 2]: [Purpose] - -### Configuration Format - -[Show the configuration file format with examples] - -## Custom Commands/Skills - -[Document how custom commands are defined] - -### Command Location - -[Where command files are stored] - -### Command File Format - -[The format of command files - markdown, YAML, etc.] - -### Metadata/Frontmatter - -[What metadata fields are supported in command files] - -```[format] -[Example of a minimal command file] -``` - -## Command Discovery - -[How the platform discovers and loads commands] - -## Platform-Specific Features - -[Any unique features relevant to command configuration] -``` - -### hooks_system.md - -Located at: `doc/platforms//hooks_system.md` - -**Structure**: -```markdown - - -# Hooks System (Command Definitions) - -## Overview - -[Brief description of hooks available for command definitions] - -**Important**: This document covers ONLY hooks available within slash command/skill definitions, not general CLI hooks. - -## Available Hooks - -### [Hook Name 1] - -**Purpose**: [What this hook does] - -**Syntax**: -```yaml -[hook_name]: - - [configuration] -``` - -**Example**: -```yaml -[Complete example of using this hook] -``` - -**Behavior**: [When and how this hook executes] - -### [Hook Name 2] - -[Repeat for each available hook] - -## Hook Execution Order - -[Document the order in which hooks execute, if multiple are supported] - -## Comparison with Other Platforms - -| Feature | | Claude Code | Other | -|---------|-----------|-------------|-------| -| [Feature 1] | [Support] | [Support] | [Support] | - -## Limitations - -[Any limitations or caveats about the hooks system] -``` - -## Quality Criteria - -- Both files exist in `doc/platforms//` -- Each file has a header comment with: - - Last updated date (YYYY-MM-DD format) - - Source URL where documentation was obtained -- `cli_configuration.md` comprehensively covers: - - Configuration file locations and format - - Custom command file format and location - - Command discovery mechanism -- `hooks_system.md` comprehensively covers: - - All hooks available for slash command definitions - - Syntax and examples for each hook - - NOT general CLI hooks (only command-level hooks) -- Documentation is detailed enough to implement the platform adapter -- No extraneous topics (only CLI config and command hooks) -- When all criteria are met, include `✓ Quality Criteria Met` in your response - -## Context - -This is the foundation step for adding a new platform to DeepWork. 
The documentation you capture here will be referenced throughout the implementation process: -- CLI configuration informs how to generate command files -- Hooks documentation determines what features the adapter needs to support -- This documentation becomes a permanent reference in `doc/platforms/` - -Take time to be thorough - incomplete documentation will slow down subsequent steps. - -## Tips - -- Use the platform's official documentation as the primary source -- If documentation is sparse, check GitHub repos, community guides, or changelog entries -- When in doubt about whether something is a "command hook" vs "CLI hook", err on the side of inclusion and note the ambiguity -- Include code examples from the official docs where available - - -### Job Context - -A workflow for adding support for a new AI platform (like Cursor, Windsurf, etc.) to DeepWork. - -The **integrate** workflow guides you through four phases: -1. **Research**: Capture the platform's CLI configuration and hooks system documentation -2. **Add Capabilities**: Update the job schema and adapters with any new hook events -3. **Implement**: Create the platform adapter, templates, tests (100% coverage), and README updates -4. **Verify**: Ensure installation works correctly and produces expected files - -The workflow ensures consistency across all supported platforms and maintains -comprehensive test coverage for new functionality. - -**Important Notes**: -- Only hooks available on slash command definitions should be captured -- Each existing adapter must be updated when new hooks are added (typically with null values) -- Tests must achieve 100% coverage for any new functionality -- Installation verification confirms the platform integrates correctly with existing jobs - - -## Required Inputs - -**User Parameters** - Gather from user before starting: -- **platform_name**: Clear identifier of the platform (e.g., 'cursor', 'windsurf-editor', 'github-copilot-chat') - - -## Work Branch - -Use branch format: `deepwork/add_platform-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/add_platform-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `cli_configuration.md` -- `hooks_system.md` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## On Completion - -1. Verify outputs are created -2. Inform user: "integrate step 1/4 complete, outputs: cli_configuration.md, hooks_system.md" -3. **Continue workflow**: Use Skill tool to invoke `/add_platform.add_capabilities` - ---- - -**Reference files**: `.deepwork/jobs/add_platform/job.yml`, `.deepwork/jobs/add_platform/steps/research.md` \ No newline at end of file diff --git a/.claude/skills/add_platform.verify/SKILL.md b/.claude/skills/add_platform.verify/SKILL.md deleted file mode 100644 index 27101223..00000000 --- a/.claude/skills/add_platform.verify/SKILL.md +++ /dev/null @@ -1,191 +0,0 @@ ---- -name: add_platform.verify -description: "Sets up platform directories and verifies deepwork install works correctly. Use after implementation to confirm integration." 
-user-invocable: false - ---- - -# add_platform.verify - -**Step 4/4** in **integrate** workflow - -> Full workflow to integrate a new AI platform into DeepWork - -> Adds a new AI platform to DeepWork with adapter, templates, and tests. Use when integrating Cursor, Windsurf, or other AI coding tools. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/add_platform.implement` - -## Instructions - -**Goal**: Sets up platform directories and verifies deepwork install works correctly. Use after implementation to confirm integration. - -# Verify Installation - -## Objective - -Ensure the new platform integration works correctly by setting up necessary directories and running the full installation process. - -## Task - -Perform end-to-end verification that the new platform can be installed and that DeepWork's standard jobs work correctly with it. - -### Prerequisites - -Ensure the implementation step is complete: -- Adapter class exists in `src/deepwork/adapters.py` -- Templates exist in `src/deepwork/templates//` -- Tests pass with 100% coverage -- README.md is updated - -### Process - -1. **Set up platform directories in the DeepWork repo** - - The DeepWork repository itself should have the platform's command directory structure for testing: - - ```bash - mkdir -p - ``` - - For example: - - Claude: `.claude/commands/` - - Cursor: `.cursor/commands/` (or wherever Cursor stores commands) - -2. **Run deepwork install for the new platform** - - ```bash - deepwork install --platform - ``` - - Verify: - - Command completes without errors - - No Python exceptions or tracebacks - - Output indicates successful installation - -3. **Check that command files were created** - - List the generated command files: - ```bash - ls -la / - ``` - - Verify: - - `deepwork_jobs.define.md` exists (or equivalent for the platform) - - `deepwork_jobs.implement.md` exists - - `deepwork_jobs.refine.md` exists - - `deepwork_rules.define.md` exists - - All expected step commands exist - -4. **Validate command file content** - - Read each generated command file and verify: - - Content matches the expected format for the platform - - Job metadata is correctly included - - Step instructions are properly rendered - - Any platform-specific features (hooks, frontmatter) are present - -5. **Test alongside existing platforms** - - If other platforms are already installed, verify they still work: - ```bash - deepwork install --platform claude - ls -la .claude/commands/ - ``` - - Ensure: - - New platform doesn't break existing installations - - Each platform's commands are independent - - No file conflicts or overwrites - -## Quality Criteria - -- Platform-specific directories are set up in the DeepWork repo -- `deepwork install --platform ` completes without errors -- All expected command files are created: - - deepwork_jobs.define, implement, refine - - deepwork_rules.define - - Any other standard job commands -- Command file content is correct: - - Matches platform's expected format - - Job/step information is properly rendered - - No template errors or missing content -- Existing platforms still work (if applicable) -- No conflicts between platforms -- When all criteria are met, include `✓ Quality Criteria Met` in your response - -## Context - -This is the final validation step before the platform is considered complete. 
A thorough verification ensures: -- The platform actually works, not just compiles -- Standard DeepWork jobs install correctly -- The platform integrates properly with the existing system -- Users can confidently use the new platform - -Take time to verify each aspect - finding issues now is much better than having users discover them later. - -## Common Issues to Check - -- **Template syntax errors**: May only appear when rendering specific content -- **Path issues**: Platform might expect different directory structure -- **Encoding issues**: Special characters in templates or content -- **Missing hooks**: Platform adapter might not handle all hook types -- **Permission issues**: Directory creation might fail in some cases - - -### Job Context - -A workflow for adding support for a new AI platform (like Cursor, Windsurf, etc.) to DeepWork. - -The **integrate** workflow guides you through four phases: -1. **Research**: Capture the platform's CLI configuration and hooks system documentation -2. **Add Capabilities**: Update the job schema and adapters with any new hook events -3. **Implement**: Create the platform adapter, templates, tests (100% coverage), and README updates -4. **Verify**: Ensure installation works correctly and produces expected files - -The workflow ensures consistency across all supported platforms and maintains -comprehensive test coverage for new functionality. - -**Important Notes**: -- Only hooks available on slash command definitions should be captured -- Each existing adapter must be updated when new hooks are added (typically with null values) -- Tests must achieve 100% coverage for any new functionality -- Installation verification confirms the platform integrates correctly with existing jobs - - -## Required Inputs - - -**Files from Previous Steps** - Read these first: -- `templates/` (from `implement`) - -## Work Branch - -Use branch format: `deepwork/add_platform-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/add_platform-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `verification_checklist.md` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## On Completion - -1. Verify outputs are created -2. Inform user: "integrate step 4/4 complete, outputs: verification_checklist.md" -3. **integrate workflow complete**: All steps finished. Consider creating a PR to merge the work branch. - ---- - -**Reference files**: `.deepwork/jobs/add_platform/job.yml`, `.deepwork/jobs/add_platform/steps/verify.md` \ No newline at end of file diff --git a/.claude/skills/add_platform/SKILL.md b/.claude/skills/add_platform/SKILL.md index 474bb3a8..6ba33119 100644 --- a/.claude/skills/add_platform/SKILL.md +++ b/.claude/skills/add_platform/SKILL.md @@ -3,12 +3,10 @@ name: add_platform description: "Adds a new AI platform to DeepWork with adapter, templates, and tests. Use when integrating Cursor, Windsurf, or other AI coding tools." --- -# add_platform +# add_platform Agent Adds a new AI platform to DeepWork with adapter, templates, and tests. Use when integrating Cursor, Windsurf, or other AI coding tools. -> **CRITICAL**: Always invoke steps using the Skill tool. 
Never copy/paste step instructions directly. - A workflow for adding support for a new AI platform (like Cursor, Windsurf, etc.) to DeepWork. The **integrate** workflow guides you through four phases: @@ -27,57 +25,867 @@ comprehensive test coverage for new functionality. - Installation verification confirms the platform integrates correctly with existing jobs -## Workflows +## Agent Overview + +This agent handles the **add_platform** job with 4 skills. + +**Workflows**: integrate +--- + +## How to Use This Agent + +### Workflows +- **integrate**: Full workflow to integrate a new AI platform into DeepWork (research → add_capabilities → implement → verify) + - Start: `research` + +### All Skills +- `research` - Captures CLI configuration and hooks system documentation for the new platform. Use when starting platform integration. +- `add_capabilities` - Updates job schema and adapters with any new hook events the platform supports. Use after research to extend DeepWork's hook system. +- `implement` - Creates platform adapter, templates, tests with 100% coverage, and README documentation. Use after adding hook capabilities. +- `verify` - Sets up platform directories and verifies deepwork install works correctly. Use after implementation to confirm integration. + +--- + +## Agent Execution Instructions + +When invoked, follow these steps: + +### Step 1: Understand Intent + +Parse the user's request to determine: +1. Which workflow or skill to execute +2. Any parameters or context provided +3. Whether this is a continuation of previous work + +### Step 2: Check Work Branch + +Before executing any skill: +1. Check current git branch +2. If on a `deepwork/add_platform-*` branch: continue using it +3. If on main/master: create new branch `deepwork/add_platform-[instance]-$(date +%Y%m%d)` + +### Step 3: Execute the Appropriate Skill + +Navigate to the relevant skill section below and follow its instructions. + +### Step 4: Workflow Continuation + +After completing a workflow step: +1. Inform the user of completion and outputs created +2. Automatically proceed to the next step if one exists +3. Continue until the workflow is complete or the user intervenes + +--- + +## Skills + +### Skill: research + +**Type**: Workflow step 1/4 in **integrate** + +**Description**: Captures CLI configuration and hooks system documentation for the new platform. Use when starting platform integration. + + +#### Required User Input + +Gather these from the user before starting: +- **platform_name**: Clear identifier of the platform (e.g., 'cursor', 'windsurf-editor', 'github-copilot-chat') + + +#### Instructions + +# Research Platform Documentation + +## Objective + +Capture comprehensive documentation for the new AI platform's CLI configuration and hooks system, creating a local reference that will guide the implementation phases. + +## Task + +Research the target platform's official documentation and create two focused documentation files that will serve as the foundation for implementing platform support in DeepWork. -### integrate +### Process -Full workflow to integrate a new AI platform into DeepWork +1. **Identify the platform's documentation sources** + - Find the official documentation website + - Locate the CLI/agent configuration documentation + - Find the hooks or customization system documentation + - Note: Focus ONLY on slash command/custom command hooks, not general CLI hooks -**Steps in order**: -1. **research** - Captures CLI configuration and hooks system documentation for the new platform. 
Use when starting platform integration. -2. **add_capabilities** - Updates job schema and adapters with any new hook events the platform supports. Use after research to extend DeepWork's hook system. -3. **implement** - Creates platform adapter, templates, tests with 100% coverage, and README documentation. Use after adding hook capabilities. -4. **verify** - Sets up platform directories and verifies deepwork install works correctly. Use after implementation to confirm integration. +2. **Gather CLI configuration documentation** + - How is the CLI configured? (config files, environment variables, etc.) + - Where are custom commands/skills stored? + - What is the command file format? (markdown, YAML, etc.) + - What metadata or frontmatter is supported? + - How does the platform discover and load commands? -**Start workflow**: `/add_platform.research` +3. **Gather hooks system documentation** + - What hooks are available for custom command definitions? + - Focus on hooks that trigger during or after command execution + - Examples: `stop_hooks`, `pre_hooks`, `post_hooks`, validation hooks + - Document the syntax and available hook types + - **Important**: Only document hooks available on slash command definitions, not general CLI hooks +4. **Create the documentation files** + - Place files in `doc/platforms/<platform_name>/` + - Each file must have a header comment with source and date + - Content should be comprehensive but focused -## Execution Instructions ## Output Format -### Step 1: Analyze Intent ### cli_configuration.md -Parse any text following `/add_platform` to determine user intent: -- "integrate" or related terms → start integrate workflow at `add_platform.research` +Located at: `doc/platforms/<platform_name>/cli_configuration.md` -### Step 2: Invoke Starting Step +**Structure**: +```markdown +<!-- Last updated: YYYY-MM-DD | Source: [documentation URL] --> -Use the Skill tool to invoke the identified starting step: +# CLI Configuration + +## Overview + +[Brief description of the platform and its CLI/agent system] + +## Configuration Files + +[Document where configuration lives and its format] + +### File Locations + +- [Location 1]: [Purpose] +- [Location 2]: [Purpose] + +### Configuration Format + +[Show the configuration file format with examples] + +## Custom Commands/Skills + +[Document how custom commands are defined] + +### Command Location + +[Where command files are stored] + +### Command File Format + +[The format of command files - markdown, YAML, etc.] + +### Metadata/Frontmatter + +[What metadata fields are supported in command files] + +```[format] +[Example of a minimal command file] ``` -Skill tool: add_platform.research + +## Command Discovery + +[How the platform discovers and loads commands] + +## Platform-Specific Features + +[Any unique features relevant to command configuration] +``` + +### hooks_system.md + +Located at: `doc/platforms/<platform_name>/hooks_system.md` + +**Structure**: +```markdown +<!-- Last updated: YYYY-MM-DD | Source: [documentation URL] --> + +# Hooks System (Command Definitions) + +## Overview + +[Brief description of hooks available for command definitions] + +**Important**: This document covers ONLY hooks available within slash command/skill definitions, not general CLI hooks.
+ +## Available Hooks + +### [Hook Name 1] + +**Purpose**: [What this hook does] + +**Syntax**: +```yaml +[hook_name]: + - [configuration] +``` + +**Example**: +```yaml +[Complete example of using this hook] ``` **Behavior**: [When and how this hook executes] + +### [Hook Name 2] + +[Repeat for each available hook] + +## Hook Execution Order + +[Document the order in which hooks execute, if multiple are supported] + +## Comparison with Other Platforms + +| Feature | <platform> | Claude Code | Other | +|---------|-----------|-------------|-------| +| [Feature 1] | [Support] | [Support] | [Support] | + +## Limitations + +[Any limitations or caveats about the hooks system] +``` + +## Quality Criteria + +- Both files exist in `doc/platforms/<platform_name>/` +- Each file has a header comment with: + - Last updated date (YYYY-MM-DD format) + - Source URL where documentation was obtained +- `cli_configuration.md` comprehensively covers: + - Configuration file locations and format + - Custom command file format and location + - Command discovery mechanism +- `hooks_system.md` comprehensively covers: + - All hooks available for slash command definitions + - Syntax and examples for each hook + - NOT general CLI hooks (only command-level hooks) +- Documentation is detailed enough to implement the platform adapter +- No extraneous topics (only CLI config and command hooks) +- When all criteria are met, include `✓ Quality Criteria Met` in your response + +## Context + +This is the foundation step for adding a new platform to DeepWork. The documentation you capture here will be referenced throughout the implementation process: +- CLI configuration informs how to generate command files +- Hooks documentation determines what features the adapter needs to support +- This documentation becomes a permanent reference in `doc/platforms/` + +Take time to be thorough - incomplete documentation will slow down subsequent steps. + +## Tips + +- Use the platform's official documentation as the primary source +- If documentation is sparse, check GitHub repos, community guides, or changelog entries +- When in doubt about whether something is a "command hook" vs "CLI hook", err on the side of inclusion and note the ambiguity +- Include code examples from the official docs where available + + +#### Outputs + +Create these files/directories: +- `cli_configuration.md` +- `hooks_system.md` + +#### On Completion + +1. Verify outputs are created +2. Inform user: "research complete, outputs: cli_configuration.md, hooks_system.md" +3. **Continue to next skill**: Proceed to `add_capabilities` + +--- + +### Skill: add_capabilities + +**Type**: Workflow step 2/4 in **integrate** + +**Description**: Updates job schema and adapters with any new hook events the platform supports. Use after research to extend DeepWork's hook system. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `research` + + +#### Input Files + +Read these files (from previous steps): +- `hooks_system.md` (from `research`) + +#### Instructions + +# Add Hook Capabilities + +## Objective + +Update the DeepWork job schema and platform adapters to support any new hook events that the new platform provides for slash command definitions. + +## Task + +Analyze the hooks documentation from the research step and update the codebase to support any new hook capabilities, ensuring consistency across all existing adapters.
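One lightweight way to run that comparison is to diff the hook names documented for the platform against the hook fields already declared on the step schema. The sketch below rests on two assumptions flagged in the comments: that `StepDefinition` is a dataclass (as shown later in this skill) and that its hook fields share a `_hooks` suffix.

```python
# Sketch: list documented hooks that the schema does not yet support.
# Assumes StepDefinition is a dataclass and hook fields end in "_hooks";
# adjust the filter to match the real naming in job_schema.py.
from dataclasses import fields

from deepwork.schemas.job_schema import StepDefinition

documented_hooks = {"stop_hooks", "pre_hooks"}  # taken from hooks_system.md
schema_hooks = {f.name for f in fields(StepDefinition) if f.name.endswith("_hooks")}
print("Hooks needing schema support:", sorted(documented_hooks - schema_hooks))
```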
+ +### Prerequisites + +Read the hooks documentation created in the previous step: +- `doc/platforms/<platform_name>/hooks_system.md` + +Also review the existing schema and adapters: +- `src/deepwork/schemas/job_schema.py` +- `src/deepwork/adapters.py` + +### Process + +1. **Analyze the new platform's hooks** + - Read `doc/platforms/<platform_name>/hooks_system.md` + - List all hooks available for slash command definitions + - Compare with hooks already in `job_schema.py` + - Identify any NEW hooks not currently supported + +2. **Determine if schema changes are needed** + - If the platform has hooks that DeepWork doesn't currently support, add them + - If all hooks are already supported, document this finding + - Remember: Only add hooks that are available on slash command definitions + +3. **Update job_schema.py (if needed)** + - Add new hook fields to the step schema + - Follow existing patterns for hook definitions + - Add appropriate type hints and documentation + - Example addition: + ```python + # New hook from <platform> + new_hook_name: Optional[List[HookConfig]] = None + ``` + +4. **Update all existing adapters** + - Open `src/deepwork/adapters.py` + - For EACH existing adapter class: + - Add the new hook field (set to `None` if not supported) + - This maintains consistency across all adapters + - Document why each adapter does or doesn't support the hook + +5. **Validate the changes** + - Run Python syntax check: `python -m py_compile src/deepwork/schemas/job_schema.py` + - Run Python syntax check: `python -m py_compile src/deepwork/adapters.py` + - Ensure no import errors + +6. **Document the decision** + - If no new hooks were added, add a comment explaining why + - If new hooks were added, ensure they're documented in the schema + +## Output Format + +### job_schema.py + +Location: `src/deepwork/schemas/job_schema.py` + +If new hooks are added: +```python +@dataclass +class StepDefinition: + # ... existing fields ... + + # New hook from <platform> - [description of what it does] + new_hook_name: Optional[List[HookConfig]] = None +``` + +### adapters.py + +Location: `src/deepwork/adapters.py` + +For each existing adapter, add the new hook field: +```python +class ExistingPlatformAdapter(PlatformAdapter): + # ... existing code ... + + def get_hook_support(self) -> dict: + return { + # ... existing hooks ... + "new_hook_name": None, # Not supported by this platform + } +``` + +Or if no changes are needed, add a documentation comment: +```python +# NOTE: <platform> hooks reviewed on YYYY-MM-DD +# No new hooks to add - all command hooks are already +# supported by the existing schema (stop_hooks covers their validation pattern) +``` + +## Quality Criteria + +- Hooks documentation from research step has been reviewed +- If new hooks exist: + - Added to `src/deepwork/schemas/job_schema.py` with proper typing + - ALL existing adapters updated in `src/deepwork/adapters.py` + - Each adapter indicates support level (implemented, None, or partial) +- If no new hooks needed: + - Decision documented with a comment explaining the analysis +- Only hooks available on slash command definitions are considered +- `job_schema.py` has no syntax errors (verified with py_compile) +- `adapters.py` has no syntax errors (verified with py_compile) +- All adapters have consistent hook fields (same fields across all adapters) +- When all criteria are met, include `✓ Quality Criteria Met` in your response + +## Context + +DeepWork supports multiple AI platforms, and each platform may have different capabilities for hooks within command definitions.
The schema defines what hooks CAN exist, while adapters define what each platform actually SUPPORTS. + +This separation allows: +- Job definitions to use any hook (the schema is the superset) +- Platform-specific generation to only use supported hooks (adapters filter) +- Future platforms to add new hooks without breaking existing ones + +Maintaining consistency is critical - all adapters must have the same hook fields, even if they don't support them (use `None` for unsupported). + +## Common Hook Types + +For reference, here are common hook patterns across platforms: + +| Hook Type | Purpose | Example Platforms | +|-----------|---------|-------------------| +| `stop_hooks` | Quality validation loops | Claude Code | +| `pre_hooks` | Run before command | Various | +| `post_hooks` | Run after command | Various | +| `validation_hooks` | Validate inputs/outputs | Various | -After each step completes: -1. Check if there's a next step in the workflow sequence -2. Invoke the next step using the Skill tool -3. Repeat until workflow is complete or user intervenes +When you find a new hook type, consider whether it maps to an existing pattern or is genuinely new functionality. -**Note**: Standalone skills do not auto-continue to other steps. -### Handling Ambiguous Intent +#### Outputs -If user intent is unclear, use AskUserQuestion to clarify: -- Present available workflows and standalone skills as options -- Let user select the starting point +Create these files/directories: +- `job_schema.py` +- `adapters.py` + +#### On Completion + +1. Verify outputs are created +2. Inform user: "add_capabilities complete, outputs: job_schema.py, adapters.py" +3. **Continue to next skill**: Proceed to `implement` + +--- + +### Skill: implement + +**Type**: Workflow step 3/4 in **integrate** + +**Description**: Creates platform adapter, templates, tests with 100% coverage, and README documentation. Use after adding hook capabilities. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `research` +- `add_capabilities` + + +#### Input Files + +Read these files (from previous steps): +- `job_schema.py` (from `add_capabilities`) +- `adapters.py` (from `add_capabilities`) +- `cli_configuration.md` (from `research`) + +#### Instructions + +# Implement Platform Support + +## Objective + +Create the complete platform implementation including the adapter class, command templates, comprehensive tests, and documentation updates. + +## Task + +Build the full platform support by implementing the adapter, creating templates, writing tests with 100% coverage, and updating the README. + +### Prerequisites + +Read the outputs from previous steps: +- `doc/platforms/<platform_name>/cli_configuration.md` - For template structure +- `src/deepwork/schemas/job_schema.py` - For current schema +- `src/deepwork/adapters.py` - For adapter patterns + +Also review existing implementations for reference: +- `src/deepwork/templates/claude/` - Example templates +- `tests/` - Existing test patterns + +### Process + +1. **Create the platform adapter class** + + Add a new adapter class to `src/deepwork/adapters.py`: + + ```python + class NewPlatformAdapter(PlatformAdapter): + """Adapter for <platform>.""" + + platform_name = "<platform>" + command_directory = "<command_directory>" # e.g., ".cursor/commands" + command_extension = ".md" # or appropriate extension + + def get_hook_support(self) -> dict: + """Return which hooks this platform supports.""" + return { + "stop_hooks": True, # or False/None + # ...
other hooks + } + + def generate_command(self, step: StepDefinition, job: JobDefinition) -> str: + """Generate command file content for this platform.""" + # Use Jinja2 template + template = self.env.get_template(f"{self.platform_name}/command.md.j2") + return template.render(step=step, job=job) + ``` + +2. **Create command templates** + + Create templates in `src/deepwork/templates/<platform>/`: + + - `command.md.j2` - Main command template + - Any other templates needed for the platform's format + + Use the CLI configuration documentation to ensure the template matches the platform's expected format. + +3. **Register the adapter** + + Update the adapter registry in `src/deepwork/adapters.py`: + + ```python + PLATFORM_ADAPTERS = { + "claude": ClaudeAdapter, + "<platform>": NewPlatformAdapter, + # ... other adapters + } + ``` + +4. **Write comprehensive tests** + + Create tests in `tests/` that cover: + + - Adapter instantiation + - Hook support detection + - Command generation + - Template rendering + - Edge cases (empty inputs, special characters, etc.) + - Integration with the sync command + + **Critical**: Tests must achieve 100% coverage of new code. + +5. **Update README.md** + + Add the new platform to `README.md`: + + - Add to "Supported Platforms" list + - Add installation instructions: + ```bash + deepwork install --platform <platform> + ``` + - Document any platform-specific notes or limitations + +6. **Run tests and verify coverage** + + ```bash + uv run pytest --cov=src/deepwork --cov-report=term-missing + ``` + + - All tests must pass + - New code must have 100% coverage + - If coverage is below 100%, add more tests + +7. **Iterate until tests pass with full coverage** + + This step has a `stop_hooks` script that runs tests. Keep iterating until: + - All tests pass + - Coverage is 100% for new functionality + +## Output Format + +### templates/ + +Location: `src/deepwork/templates/<platform>/` + +Create the following files: + +**command.md.j2**: +```jinja2 +{# Template for command files #} +{# Follows the platform's expected format from cli_configuration.md #} + +[Platform-specific frontmatter or metadata] + +# {{ step.name }} + +{{ step.description }} + +## Instructions + +{{ step.instructions_content }} + +[... rest of template based on platform format ...] +``` + +### tests/ + +Location: `tests/test_<platform>_adapter.py` + +```python +"""Tests for the <platform> adapter.""" +import pytest +from deepwork.adapters import NewPlatformAdapter + +class TestNewPlatformAdapter: + """Test suite for NewPlatformAdapter.""" + + def test_adapter_initialization(self): + """Test adapter can be instantiated.""" + adapter = NewPlatformAdapter() + assert adapter.platform_name == "<platform>" + + def test_hook_support(self): + """Test hook support detection.""" + adapter = NewPlatformAdapter() + hooks = adapter.get_hook_support() + assert "stop_hooks" in hooks + # ... more assertions + + def test_command_generation(self): + """Test command file generation.""" + # ... test implementation + + # ...
more tests for 100% coverage ``` + +### README.md + +Add to the existing README.md: + +```markdown +## Supported Platforms + +- **Claude Code** - Anthropic's CLI for Claude +- **<Platform>** - [Brief description] + +## Installation + +### <Platform> + +```bash +deepwork install --platform <platform> +``` + +[Any platform-specific notes] +``` + +## Quality Criteria + +- Platform adapter class added to `src/deepwork/adapters.py`: + - Inherits from `PlatformAdapter` + - Implements all required methods + - Registered in `PLATFORM_ADAPTERS` +- Templates created in `src/deepwork/templates/<platform>/`: + - `command.md.j2` exists and renders correctly + - Format matches platform's expected command format +- Tests created in `tests/`: + - Cover all new adapter functionality + - Cover template rendering + - All tests pass +- Test coverage is 100% for new code: + - Run `uv run pytest --cov=src/deepwork --cov-report=term-missing` + - No uncovered lines in new code +- README.md updated: + - Platform listed in supported platforms + - Installation command documented + - Any platform-specific notes included +- When all criteria are met, include `✓ Quality Criteria Met` in your response + +## Context + +This is the core implementation step. The adapter you create will be responsible for: +- Determining where command files are placed +- Generating command file content from job definitions +- Handling platform-specific features and hooks + +The templates use Jinja2 and should produce files that match exactly what the platform expects. Reference the CLI configuration documentation frequently to ensure compatibility. + +## Tips + +- Study the existing `ClaudeAdapter` as a reference implementation +- Run tests frequently as you implement +- Use `--cov-report=html` for a detailed coverage report +- If a test is hard to write, the code might need refactoring +- Template syntax errors often show up at runtime - test early + + +#### Outputs + +Create these files/directories: +- `templates/` (directory) +- `tests/` (directory) +- `README.md` + +#### On Completion + +1. Verify outputs are created +2. Inform user: "implement complete, outputs: templates/, tests/, README.md" +3. **Continue to next skill**: Proceed to `verify` + +--- + +### Skill: verify + +**Type**: Workflow step 4/4 in **integrate** + +**Description**: Sets up platform directories and verifies deepwork install works correctly. Use after implementation to confirm integration. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `implement` + + +#### Input Files + +Read these files (from previous steps): +- `templates/` (from `implement`) + +#### Instructions + +# Verify Installation + +## Objective + +Ensure the new platform integration works correctly by setting up necessary directories and running the full installation process. + +## Task + +Perform end-to-end verification that the new platform can be installed and that DeepWork's standard jobs work correctly with it. + +### Prerequisites + +Ensure the implementation step is complete: +- Adapter class exists in `src/deepwork/adapters.py` +- Templates exist in `src/deepwork/templates/<platform>/` +- Tests pass with 100% coverage +- README.md is updated + +### Process + +1. **Set up platform directories in the DeepWork repo** + + The DeepWork repository itself should have the platform's command directory structure for testing: + + ```bash + mkdir -p <command_directory> + ``` + + For example: + - Claude: `.claude/commands/` + - Cursor: `.cursor/commands/` (or wherever Cursor stores commands) + +2.
+
+## Quality Criteria
+
+- Platform-specific directories are set up in the DeepWork repo
+- `deepwork install --platform <platform>` completes without errors
+- All expected command files are created:
+  - deepwork_jobs.define, implement, refine
+  - deepwork_rules.define
+  - Any other standard job commands
+- Command file content is correct:
+  - Matches platform's expected format
+  - Job/step information is properly rendered
+  - No template errors or missing content
+- Existing platforms still work (if applicable)
+- No conflicts between platforms
+- When all criteria are met, include `✓ Quality Criteria Met` in your response
+
+## Context
+
+This is the final validation step before the platform is considered complete. A thorough verification ensures:
+- The platform actually works, not just compiles
+- Standard DeepWork jobs install correctly
+- The platform integrates properly with the existing system
+- Users can confidently use the new platform
+
+Take time to verify each aspect - finding issues now is much better than having users discover them later.
+
+## Common Issues to Check
+
+- **Template syntax errors**: May only appear when rendering specific content
+- **Path issues**: Platform might expect different directory structure
+- **Encoding issues**: Special characters in templates or content
+- **Missing hooks**: Platform adapter might not handle all hook types
+- **Permission issues**: Directory creation might fail in some cases
+
+
+#### Outputs
+
+Create these files/directories:
+- `verification_checklist.md`
+
+#### On Completion
+
+1. Verify outputs are created
+2. Inform user: "integrate workflow complete, outputs: verification_checklist.md"
+3.
Consider creating a PR to merge the work branch + +--- ## Guardrails -- Do NOT copy/paste step instructions directly; always use the Skill tool to invoke steps -- Do NOT skip steps in a workflow unless the user explicitly requests it -- Do NOT proceed to the next step if the current step's outputs are incomplete -- Do NOT make assumptions about user intent; ask for clarification when ambiguous +- **Never skip prerequisites**: Always verify required steps are complete before running a skill +- **Never produce partial outputs**: Complete all required outputs before marking a skill done +- **Always use the work branch**: Never commit directly to main/master +- **Follow quality criteria**: Use sub-agent review when quality criteria are specified +- **Ask for clarification**: If user intent is unclear, ask before proceeding ## Context Files -- Job definition: `.deepwork/jobs/add_platform/job.yml` \ No newline at end of file +- Job definition: `.deepwork/jobs/add_platform/job.yml` +- research instructions: `.deepwork/jobs/add_platform/steps/research.md` +- add_capabilities instructions: `.deepwork/jobs/add_platform/steps/add_capabilities.md` +- implement instructions: `.deepwork/jobs/add_platform/steps/implement.md` +- verify instructions: `.deepwork/jobs/add_platform/steps/verify.md` diff --git a/.claude/skills/commit.commit_and_push/SKILL.md b/.claude/skills/commit.commit_and_push/SKILL.md deleted file mode 100644 index 1ec4e5c9..00000000 --- a/.claude/skills/commit.commit_and_push/SKILL.md +++ /dev/null @@ -1,179 +0,0 @@ ---- -name: commit.commit_and_push -description: "Verifies changed files, creates commit, and pushes to remote. Use after linting passes to finalize changes." -user-invocable: false - ---- - -# commit.commit_and_push - -**Step 4/4** in **full** workflow - -> Full commit workflow: review, test, lint, and commit - -> Reviews code, runs tests, lints, and commits changes. Use when ready to commit work with quality checks. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/commit.lint` - -## Instructions - -**Goal**: Verifies changed files, creates commit, and pushes to remote. Use after linting passes to finalize changes. - -# Commit and Push - -## Objective - -Review the changed files to verify they match the agent's expectations, create a commit with an appropriate message, and push to the remote repository. - -## Task - -Check the list of changed files against what was modified during this session, ensure they match expectations, then commit and push the changes. - -### Process - -1. **Get the list of changed files** - ```bash - git status - ``` - Also run `git diff --stat` to see a summary of changes. - -2. **Verify changes match expectations** - - Compare the changed files against what you modified during this session: - - Do the modified files match what you edited? - - Are there any unexpected new files? - - Are there any unexpected deleted files? - - Do the line counts seem reasonable for the changes you made? - - If changes match expectations, proceed to the next step. - - If there are unexpected changes: - - Investigate why (e.g., lint auto-fixes, generated files) - - If they're legitimate side effects of your work, include them - - If they're unrelated or shouldn't be committed, use `git restore` to discard them - -3. 
**Update CHANGELOG.md if needed** - - If your changes include new features, bug fixes, or other notable changes: - - Add entries to the `## [Unreleased]` section of CHANGELOG.md - - Use the appropriate subsection: `### Added`, `### Changed`, `### Fixed`, or `### Removed` - - Write concise descriptions that explain the user-facing impact - - **CRITICAL: NEVER modify version numbers** - - Do NOT change the version in `pyproject.toml` - - Do NOT change version headers in CHANGELOG.md (e.g., `## [0.4.2]`) - - Do NOT rename the `## [Unreleased]` section - - Version updates are handled by the release workflow, not commits - -4. **Stage all appropriate changes** - ```bash - git add -A - ``` - Or stage specific files if some were excluded. - -5. **View recent commit messages for style reference** - ```bash - git log --oneline -10 - ``` - -6. **Create the commit** - - Generate an appropriate commit message based on: - - The changes made - - The style of recent commits - - Conventional commit format if the project uses it - - **IMPORTANT:** Use the commit job script (not `git commit` directly): - ```bash - .claude/hooks/commit_job_git_commit.sh -m "commit message here" - ``` - -7. **Push to remote** - ```bash - git push - ``` - If the branch has no upstream, use: - ```bash - git push -u origin HEAD - ``` - -## Quality Criteria - -- Changed files were verified against expectations -- CHANGELOG.md was updated with entries in [Unreleased] section (if changes warrant documentation) -- Version numbers were NOT modified (pyproject.toml version and CHANGELOG version headers unchanged) -- Commit was created with appropriate message -- Changes were pushed to remote - -## Context - -This is the final step of the commit workflow. The agent verifies that the changed files match its own expectations from the work done during the session, then commits and pushes. This catches unexpected changes while avoiding unnecessary user interruptions. - - -### Job Context - -A workflow for preparing and committing code changes with quality checks. - -The **full** workflow starts with a code review to catch issues early, runs tests until -they pass, formats and lints code with ruff, then reviews changed files -before committing and pushing. The review and lint steps use sub-agents -to reduce context usage. - -Steps: -1. review - Code review for issues, DRY opportunities, naming, and test coverage (runs in sub-agent) -2. test - Pull latest code and run tests until they pass -3. lint - Format and lint code with ruff (runs in sub-agent) -4. commit_and_push - Review changes and commit/push - - - -## Work Branch - -Use branch format: `deepwork/commit-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/commit-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `changes_committed` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. Changed files were verified against expectations -2. 
CHANGELOG.md was updated with entries in [Unreleased] section (if changes warrant documentation) -3. Version numbers were NOT modified (pyproject.toml version and CHANGELOG version headers unchanged) -4. Commit was created with appropriate message -5. Changes were pushed to remote -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "full step 4/4 complete, outputs: changes_committed" -3. **full workflow complete**: All steps finished. Consider creating a PR to merge the work branch. - ---- - -**Reference files**: `.deepwork/jobs/commit/job.yml`, `.deepwork/jobs/commit/steps/commit_and_push.md` \ No newline at end of file diff --git a/.claude/skills/commit.lint/SKILL.md b/.claude/skills/commit.lint/SKILL.md deleted file mode 100644 index 0f16873e..00000000 --- a/.claude/skills/commit.lint/SKILL.md +++ /dev/null @@ -1,158 +0,0 @@ ---- -name: commit.lint -description: "Formats and lints code with ruff using a sub-agent. Use after tests pass to ensure code style compliance." -user-invocable: false - ---- - -# commit.lint - -**Step 3/4** in **full** workflow - -> Full commit workflow: review, test, lint, and commit - -> Reviews code, runs tests, lints, and commits changes. Use when ready to commit work with quality checks. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/commit.test` - -## Instructions - -**Goal**: Formats and lints code with ruff using a sub-agent. Use after tests pass to ensure code style compliance. - -# Lint Code - -## Objective - -Format and lint the codebase using ruff to ensure code quality and consistency. - -## Task - -Run ruff format and ruff check to format and lint the code. This step should be executed using a sub-agent to conserve context in the main conversation. - -### Process - -**IMPORTANT**: Use the Task tool to spawn a sub-agent for this work. This saves context in the main conversation. Use the `haiku` model for speed. - -1. **Spawn a sub-agent to run linting** - - Use the Task tool with these parameters: - - `subagent_type`: "Bash" - - `model`: "haiku" - - `prompt`: See below - - The sub-agent should: - - a. **Run ruff format** - ```bash - ruff format . - ``` - This formats the code according to ruff's style rules. - - b. **Run ruff check with auto-fix** - ```bash - ruff check --fix . - ``` - This checks for lint errors and automatically fixes what it can. - - c. **Run ruff check again to verify** - ```bash - ruff check . - ``` - Capture the final output to verify no remaining issues. - -2. **Review sub-agent results** - - Check that both format and check completed successfully - - Note any remaining lint issues that couldn't be auto-fixed - -3. **Handle remaining issues** - - If there are lint errors that couldn't be auto-fixed, fix them manually - - Re-run ruff check to verify - -## Example Sub-Agent Prompt - -``` -Run ruff to format and lint the codebase: - -1. Run: ruff format . -2. Run: ruff check --fix . -3. Run: ruff check . (to verify no remaining issues) - -Report the results of each command. 
-``` - -## Quality Criteria - -- ruff format was run successfully -- ruff check was run with --fix flag -- No remaining lint errors - -## Context - -This step ensures code quality and consistency before committing. It runs after tests pass and before the commit step. Using a sub-agent keeps the main conversation context clean for the commit review. - - -### Job Context - -A workflow for preparing and committing code changes with quality checks. - -The **full** workflow starts with a code review to catch issues early, runs tests until -they pass, formats and lints code with ruff, then reviews changed files -before committing and pushing. The review and lint steps use sub-agents -to reduce context usage. - -Steps: -1. review - Code review for issues, DRY opportunities, naming, and test coverage (runs in sub-agent) -2. test - Pull latest code and run tests until they pass -3. lint - Format and lint code with ruff (runs in sub-agent) -4. commit_and_push - Review changes and commit/push - - - -## Work Branch - -Use branch format: `deepwork/commit-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/commit-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `code_formatted` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. ruff format was run successfully -2. ruff check was run with --fix flag -3. No remaining lint errors -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "full step 3/4 complete, outputs: code_formatted" -3. **Continue workflow**: Use Skill tool to invoke `/commit.commit_and_push` - ---- - -**Reference files**: `.deepwork/jobs/commit/job.yml`, `.deepwork/jobs/commit/steps/lint.md` \ No newline at end of file diff --git a/.claude/skills/commit.review/SKILL.md b/.claude/skills/commit.review/SKILL.md deleted file mode 100644 index f4074077..00000000 --- a/.claude/skills/commit.review/SKILL.md +++ /dev/null @@ -1,139 +0,0 @@ ---- -name: commit.review -description: "Reviews changed code for issues, DRY opportunities, naming clarity, and test coverage using a sub-agent. Use as the first step before testing." -user-invocable: false - ---- - -# commit.review - -**Step 1/4** in **full** workflow - -> Full commit workflow: review, test, lint, and commit - -> Reviews code, runs tests, lints, and commits changes. Use when ready to commit work with quality checks. - - -## Instructions - -**Goal**: Reviews changed code for issues, DRY opportunities, naming clarity, and test coverage using a sub-agent. 
Use as the first step before testing. - -# Code Review - -## Objective - -Review changed code for quality issues before running tests. This catches problems early and ensures code meets quality standards. - -## Task - -Use a sub-agent to review the staged/changed code and identify issues that should be fixed before committing. - -### Process - -**IMPORTANT**: Use the Task tool to spawn a sub-agent for this review. This saves context in the main conversation. - -1. **Get the list of changed files** - ```bash - git diff --name-only HEAD - git diff --name-only --staged - ``` - Combine these to get all files that have been modified. - -2. **Spawn a sub-agent to review the code** - - Use the Task tool with these parameters: - - `subagent_type`: "general-purpose" - - `prompt`: Instruct the sub-agent to: - - Read the code review standards from `doc/code_review_standards.md` - - Read each of the changed files - - Review each file against the standards - - Report issues found with file, line number, severity, and suggested fix - -3. **Review sub-agent findings** - - Examine each issue identified - - Prioritize issues by severity - -4. **Fix identified issues** - - Address each issue found by the review - - For DRY violations: extract shared code into functions/modules - - For naming issues: rename to be clearer - - For missing tests: add appropriate test cases - - For bugs: fix the underlying issue - -5. **Re-run review if significant changes made** - - If you made substantial changes, consider running another review pass - - Ensure fixes didn't introduce new issues - -## Quality Criteria - -- Changed files were identified -- Sub-agent read the code review standards and reviewed all changed files -- All identified issues were addressed or documented as intentional - -## Context - -This is the first step of the commit workflow. Code review happens before tests to catch quality issues early. The sub-agent approach keeps the main conversation context clean while providing thorough review coverage. - - -### Job Context - -A workflow for preparing and committing code changes with quality checks. - -The **full** workflow starts with a code review to catch issues early, runs tests until -they pass, formats and lints code with ruff, then reviews changed files -before committing and pushing. The review and lint steps use sub-agents -to reduce context usage. - -Steps: -1. review - Code review for issues, DRY opportunities, naming, and test coverage (runs in sub-agent) -2. test - Pull latest code and run tests until they pass -3. lint - Format and lint code with ruff (runs in sub-agent) -4. commit_and_push - Review changes and commit/push - - - -## Work Branch - -Use branch format: `deepwork/commit-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/commit-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `code_reviewed` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. 
Changed files were identified -2. Sub-agent reviewed the code for general issues, DRY opportunities, naming clarity, and test coverage -3. All identified issues were addressed or documented as intentional -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "full step 1/4 complete, outputs: code_reviewed" -3. **Continue workflow**: Use Skill tool to invoke `/commit.test` - ---- - -**Reference files**: `.deepwork/jobs/commit/job.yml`, `.deepwork/jobs/commit/steps/review.md` \ No newline at end of file diff --git a/.claude/skills/commit.test/SKILL.md b/.claude/skills/commit.test/SKILL.md deleted file mode 100644 index cddf224a..00000000 --- a/.claude/skills/commit.test/SKILL.md +++ /dev/null @@ -1,138 +0,0 @@ ---- -name: commit.test -description: "Pulls latest code and runs tests until all pass. Use after code review passes to verify changes work correctly." -user-invocable: false - ---- - -# commit.test - -**Step 2/4** in **full** workflow - -> Full commit workflow: review, test, lint, and commit - -> Reviews code, runs tests, lints, and commits changes. Use when ready to commit work with quality checks. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/commit.review` - -## Instructions - -**Goal**: Pulls latest code and runs tests until all pass. Use after code review passes to verify changes work correctly. - -# Run Tests - -## Objective - -Run the project's test suite and fix any failing tests until all tests pass. - -## Task - -Execute the test suite for the project and iteratively fix any failures until all tests pass. - -### Process - -1. **Pull latest code from the branch** - - Run `git pull` to fetch and merge any changes from the remote - - If there are merge conflicts, resolve them before proceeding - - This ensures you're testing against the latest code - -2. **Detect or use the test command** - - If a test command was provided, use that - - Otherwise, auto-detect the project type and determine the appropriate test command: - - Python: `pytest`, `python -m pytest`, `uv run pytest` - - Node.js: `npm test`, `yarn test`, `bun test` - - Go: `go test ./...` - - Rust: `cargo test` - - Check `package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod` for hints - -3. **Run the tests** - - Execute the test command - - Capture the output - -4. **Analyze failures** - - If tests pass, proceed to output - - If tests fail, analyze the failure messages - - Identify the root cause of each failure - -5. **Fix failing tests** - - Make the necessary code changes to fix failures - - This may involve fixing bugs in implementation code or updating tests - - Re-run tests after each fix - -6. **Iterate until passing** - - Continue the fix/test cycle until all tests pass - -## Quality Criteria - -- Latest code was pulled from the branch -- All tests are passing - -## Context - -This step runs after code review. Tests must pass before proceeding to lint and commit. This ensures code quality and prevents broken code from being committed. 
If tests fail due to issues introduced by the code review fixes, iterate on the fixes until tests pass. - - -### Job Context - -A workflow for preparing and committing code changes with quality checks. - -The **full** workflow starts with a code review to catch issues early, runs tests until -they pass, formats and lints code with ruff, then reviews changed files -before committing and pushing. The review and lint steps use sub-agents -to reduce context usage. - -Steps: -1. review - Code review for issues, DRY opportunities, naming, and test coverage (runs in sub-agent) -2. test - Pull latest code and run tests until they pass -3. lint - Format and lint code with ruff (runs in sub-agent) -4. commit_and_push - Review changes and commit/push - - - -## Work Branch - -Use branch format: `deepwork/commit-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/commit-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `tests_passing` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. Latest code was pulled from the branch -2. All tests are passing -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "full step 2/4 complete, outputs: tests_passing" -3. **Continue workflow**: Use Skill tool to invoke `/commit.lint` - ---- - -**Reference files**: `.deepwork/jobs/commit/job.yml`, `.deepwork/jobs/commit/steps/test.md` \ No newline at end of file diff --git a/.claude/skills/commit/SKILL.md b/.claude/skills/commit/SKILL.md index 3839ffbd..3db8cf18 100644 --- a/.claude/skills/commit/SKILL.md +++ b/.claude/skills/commit/SKILL.md @@ -3,12 +3,10 @@ name: commit description: "Reviews code, runs tests, lints, and commits changes. Use when ready to commit work with quality checks." --- -# commit +# commit Agent Reviews code, runs tests, lints, and commits changes. Use when ready to commit work with quality checks. -> **CRITICAL**: Always invoke steps using the Skill tool. Never copy/paste step instructions directly. - A workflow for preparing and committing code changes with quality checks. The **full** workflow starts with a code review to catch issues early, runs tests until @@ -23,57 +21,486 @@ Steps: 4. commit_and_push - Review changes and commit/push -## Workflows +## Agent Overview + +This agent handles the **commit** job with 4 skills. 
+ +**Workflows**: full +--- + +## How to Use This Agent + +### Workflows +- **full**: Full commit workflow: review, test, lint, and commit (review → test → lint → commit_and_push) + - Start: `review` + +### All Skills +- `review` - Reviews changed code for issues, DRY opportunities, naming clarity, and test coverage using a sub-agent. Use as the first step before testing. +- `test` - Pulls latest code and runs tests until all pass. Use after code review passes to verify changes work correctly. +- `lint` - Formats and lints code with ruff using a sub-agent. Use after tests pass to ensure code style compliance. +- `commit_and_push` - Verifies changed files, creates commit, and pushes to remote. Use after linting passes to finalize changes. + +--- + +## Agent Execution Instructions + +When invoked, follow these steps: + +### Step 1: Understand Intent + +Parse the user's request to determine: +1. Which workflow or skill to execute +2. Any parameters or context provided +3. Whether this is a continuation of previous work + +### Step 2: Check Work Branch + +Before executing any skill: +1. Check current git branch +2. If on a `deepwork/commit-*` branch: continue using it +3. If on main/master: create new branch `deepwork/commit-[instance]-$(date +%Y%m%d)` + +### Step 3: Execute the Appropriate Skill + +Navigate to the relevant skill section below and follow its instructions. + +### Step 4: Workflow Continuation + +After completing a workflow step: +1. Inform the user of completion and outputs created +2. Automatically proceed to the next step if one exists +3. Continue until the workflow is complete or the user intervenes + +--- + +## Skills + +### Skill: review + +**Type**: Workflow step 1/4 in **full** + +**Description**: Reviews changed code for issues, DRY opportunities, naming clarity, and test coverage using a sub-agent. Use as the first step before testing. + + + + +#### Instructions + +# Code Review + +## Objective + +Review changed code for quality issues before running tests. This catches problems early and ensures code meets quality standards. + +## Task + +Use a sub-agent to review the staged/changed code and identify issues that should be fixed before committing. + +### Process + +**IMPORTANT**: Use the Task tool to spawn a sub-agent for this review. This saves context in the main conversation. + +1. **Get the list of changed files** + ```bash + git diff --name-only HEAD + git diff --name-only --staged + ``` + Combine these to get all files that have been modified. + +2. **Spawn a sub-agent to review the code** + + Use the Task tool with these parameters: + - `subagent_type`: "general-purpose" + - `prompt`: Instruct the sub-agent to: + - Read the code review standards from `doc/code_review_standards.md` + - Read each of the changed files + - Review each file against the standards + - Report issues found with file, line number, severity, and suggested fix + +3. **Review sub-agent findings** + - Examine each issue identified + - Prioritize issues by severity + +4. **Fix identified issues** + - Address each issue found by the review + - For DRY violations: extract shared code into functions/modules + - For naming issues: rename to be clearer + - For missing tests: add appropriate test cases + - For bugs: fix the underlying issue + +5. 
**Re-run review if significant changes made** + - If you made substantial changes, consider running another review pass + - Ensure fixes didn't introduce new issues + +## Quality Criteria + +- Changed files were identified +- Sub-agent read the code review standards and reviewed all changed files +- All identified issues were addressed or documented as intentional + +## Context + +This is the first step of the commit workflow. Code review happens before tests to catch quality issues early. The sub-agent approach keeps the main conversation context clean while providing thorough review coverage. + + +#### Outputs + +Create these files/directories: +- `code_reviewed` +#### Quality Validation + +Before completing this skill, verify: +1. Changed files were identified +2. Sub-agent reviewed the code for general issues, DRY opportunities, naming clarity, and test coverage +3. All identified issues were addressed or documented as intentional + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "review complete, outputs: code_reviewed" +3. **Continue to next skill**: Proceed to `test` + +--- + +### Skill: test + +**Type**: Workflow step 2/4 in **full** + +**Description**: Pulls latest code and runs tests until all pass. Use after code review passes to verify changes work correctly. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `review` + + + +#### Instructions + +# Run Tests + +## Objective + +Run the project's test suite and fix any failing tests until all tests pass. + +## Task + +Execute the test suite for the project and iteratively fix any failures until all tests pass. + +### Process + +1. **Pull latest code from the branch** + - Run `git pull` to fetch and merge any changes from the remote + - If there are merge conflicts, resolve them before proceeding + - This ensures you're testing against the latest code + +2. **Detect or use the test command** + - If a test command was provided, use that + - Otherwise, auto-detect the project type and determine the appropriate test command: + - Python: `pytest`, `python -m pytest`, `uv run pytest` + - Node.js: `npm test`, `yarn test`, `bun test` + - Go: `go test ./...` + - Rust: `cargo test` + - Check `package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod` for hints + +3. **Run the tests** + - Execute the test command + - Capture the output + +4. **Analyze failures** + - If tests pass, proceed to output + - If tests fail, analyze the failure messages + - Identify the root cause of each failure + +5. **Fix failing tests** + - Make the necessary code changes to fix failures + - This may involve fixing bugs in implementation code or updating tests + - Re-run tests after each fix + +6. **Iterate until passing** + - Continue the fix/test cycle until all tests pass + +## Quality Criteria + +- Latest code was pulled from the branch +- All tests are passing -### full +## Context -Full commit workflow: review, test, lint, and commit +This step runs after code review. Tests must pass before proceeding to lint and commit. This ensures code quality and prevents broken code from being committed. If tests fail due to issues introduced by the code review fixes, iterate on the fixes until tests pass. -**Steps in order**: -1. **review** - Reviews changed code for issues, DRY opportunities, naming clarity, and test coverage using a sub-agent. Use as the first step before testing. -2. 
**test** - Pulls latest code and runs tests until all pass. Use after code review passes to verify changes work correctly. -3. **lint** - Formats and lints code with ruff using a sub-agent. Use after tests pass to ensure code style compliance. -4. **commit_and_push** - Verifies changed files, creates commit, and pushes to remote. Use after linting passes to finalize changes. -**Start workflow**: `/commit.review` +#### Outputs +Create these files/directories: +- `tests_passing` +#### Quality Validation -## Execution Instructions +Before completing this skill, verify: +1. Latest code was pulled from the branch +2. All tests are passing -### Step 1: Analyze Intent +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. -Parse any text following `/commit` to determine user intent: -- "full" or related terms → start full workflow at `commit.review` +#### On Completion -### Step 2: Invoke Starting Step +1. Verify outputs are created +2. Inform user: "test complete, outputs: tests_passing" +3. **Continue to next skill**: Proceed to `lint` + +--- + +### Skill: lint + +**Type**: Workflow step 3/4 in **full** + +**Description**: Formats and lints code with ruff using a sub-agent. Use after tests pass to ensure code style compliance. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `test` + + + +#### Instructions + +# Lint Code + +## Objective + +Format and lint the codebase using ruff to ensure code quality and consistency. + +## Task + +Run ruff format and ruff check to format and lint the code. This step should be executed using a sub-agent to conserve context in the main conversation. + +### Process + +**IMPORTANT**: Use the Task tool to spawn a sub-agent for this work. This saves context in the main conversation. Use the `haiku` model for speed. + +1. **Spawn a sub-agent to run linting** + + Use the Task tool with these parameters: + - `subagent_type`: "Bash" + - `model`: "haiku" + - `prompt`: See below + + The sub-agent should: + + a. **Run ruff format** + ```bash + ruff format . + ``` + This formats the code according to ruff's style rules. + + b. **Run ruff check with auto-fix** + ```bash + ruff check --fix . + ``` + This checks for lint errors and automatically fixes what it can. + + c. **Run ruff check again to verify** + ```bash + ruff check . + ``` + Capture the final output to verify no remaining issues. + +2. **Review sub-agent results** + - Check that both format and check completed successfully + - Note any remaining lint issues that couldn't be auto-fixed + +3. **Handle remaining issues** + - If there are lint errors that couldn't be auto-fixed, fix them manually + - Re-run ruff check to verify + +## Example Sub-Agent Prompt -Use the Skill tool to invoke the identified starting step: ``` -Skill tool: commit.review +Run ruff to format and lint the codebase: + +1. Run: ruff format . +2. Run: ruff check --fix . +3. Run: ruff check . (to verify no remaining issues) + +Report the results of each command. ``` -### Step 3: Continue Workflow Automatically +## Quality Criteria + +- ruff format was run successfully +- ruff check was run with --fix flag +- No remaining lint errors + +## Context -After each step completes: -1. Check if there's a next step in the workflow sequence -2. Invoke the next step using the Skill tool -3. Repeat until workflow is complete or user intervenes +This step ensures code quality and consistency before committing. It runs after tests pass and before the commit step. 
Using a sub-agent keeps the main conversation context clean for the commit review. + + +#### Outputs + +Create these files/directories: +- `code_formatted` +#### Quality Validation + +Before completing this skill, verify: +1. ruff format was run successfully +2. ruff check was run with --fix flag +3. No remaining lint errors + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "lint complete, outputs: code_formatted" +3. **Continue to next skill**: Proceed to `commit_and_push` + +--- -**Note**: Standalone skills do not auto-continue to other steps. +### Skill: commit_and_push -### Handling Ambiguous Intent +**Type**: Workflow step 4/4 in **full** -If user intent is unclear, use AskUserQuestion to clarify: -- Present available workflows and standalone skills as options -- Let user select the starting point +**Description**: Verifies changed files, creates commit, and pushes to remote. Use after linting passes to finalize changes. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `lint` + + + +#### Instructions + +# Commit and Push + +## Objective + +Review the changed files to verify they match the agent's expectations, create a commit with an appropriate message, and push to the remote repository. + +## Task + +Check the list of changed files against what was modified during this session, ensure they match expectations, then commit and push the changes. + +### Process + +1. **Get the list of changed files** + ```bash + git status + ``` + Also run `git diff --stat` to see a summary of changes. + +2. **Verify changes match expectations** + + Compare the changed files against what you modified during this session: + - Do the modified files match what you edited? + - Are there any unexpected new files? + - Are there any unexpected deleted files? + - Do the line counts seem reasonable for the changes you made? + + If changes match expectations, proceed to the next step. + + If there are unexpected changes: + - Investigate why (e.g., lint auto-fixes, generated files) + - If they're legitimate side effects of your work, include them + - If they're unrelated or shouldn't be committed, use `git restore` to discard them + +3. **Update CHANGELOG.md if needed** + + If your changes include new features, bug fixes, or other notable changes: + - Add entries to the `## [Unreleased]` section of CHANGELOG.md + - Use the appropriate subsection: `### Added`, `### Changed`, `### Fixed`, or `### Removed` + - Write concise descriptions that explain the user-facing impact + + **CRITICAL: NEVER modify version numbers** + - Do NOT change the version in `pyproject.toml` + - Do NOT change version headers in CHANGELOG.md (e.g., `## [0.4.2]`) + - Do NOT rename the `## [Unreleased]` section + - Version updates are handled by the release workflow, not commits + +4. **Stage all appropriate changes** + ```bash + git add -A + ``` + Or stage specific files if some were excluded. + +5. **View recent commit messages for style reference** + ```bash + git log --oneline -10 + ``` + +6. **Create the commit** + + Generate an appropriate commit message based on: + - The changes made + - The style of recent commits + - Conventional commit format if the project uses it + + **IMPORTANT:** Use the commit job script (not `git commit` directly): + ```bash + .claude/hooks/commit_job_git_commit.sh -m "commit message here" + ``` + +7. 
**Push to remote** + ```bash + git push + ``` + If the branch has no upstream, use: + ```bash + git push -u origin HEAD + ``` + +## Quality Criteria + +- Changed files were verified against expectations +- CHANGELOG.md was updated with entries in [Unreleased] section (if changes warrant documentation) +- Version numbers were NOT modified (pyproject.toml version and CHANGELOG version headers unchanged) +- Commit was created with appropriate message +- Changes were pushed to remote + +## Context + +This is the final step of the commit workflow. The agent verifies that the changed files match its own expectations from the work done during the session, then commits and pushes. This catches unexpected changes while avoiding unnecessary user interruptions. + + +#### Outputs + +Create these files/directories: +- `changes_committed` +#### Quality Validation + +Before completing this skill, verify: +1. Changed files were verified against expectations +2. CHANGELOG.md was updated with entries in [Unreleased] section (if changes warrant documentation) +3. Version numbers were NOT modified (pyproject.toml version and CHANGELOG version headers unchanged) +4. Commit was created with appropriate message +5. Changes were pushed to remote + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "full workflow complete, outputs: changes_committed" +3. Consider creating a PR to merge the work branch + +--- ## Guardrails -- Do NOT copy/paste step instructions directly; always use the Skill tool to invoke steps -- Do NOT skip steps in a workflow unless the user explicitly requests it -- Do NOT proceed to the next step if the current step's outputs are incomplete -- Do NOT make assumptions about user intent; ask for clarification when ambiguous +- **Never skip prerequisites**: Always verify required steps are complete before running a skill +- **Never produce partial outputs**: Complete all required outputs before marking a skill done +- **Always use the work branch**: Never commit directly to main/master +- **Follow quality criteria**: Use sub-agent review when quality criteria are specified +- **Ask for clarification**: If user intent is unclear, ask before proceeding ## Context Files -- Job definition: `.deepwork/jobs/commit/job.yml` \ No newline at end of file +- Job definition: `.deepwork/jobs/commit/job.yml` +- review instructions: `.deepwork/jobs/commit/steps/review.md` +- test instructions: `.deepwork/jobs/commit/steps/test.md` +- lint instructions: `.deepwork/jobs/commit/steps/lint.md` +- commit_and_push instructions: `.deepwork/jobs/commit/steps/commit_and_push.md` diff --git a/.claude/skills/deepwork_jobs.define/SKILL.md b/.claude/skills/deepwork_jobs.define/SKILL.md deleted file mode 100644 index a6ec07d3..00000000 --- a/.claude/skills/deepwork_jobs.define/SKILL.md +++ /dev/null @@ -1,676 +0,0 @@ ---- -name: deepwork_jobs.define -description: "Creates a job.yml specification by gathering workflow requirements through structured questions. Use when starting a new multi-step workflow." -user-invocable: false - ---- - -# deepwork_jobs.define - -**Step 1/3** in **new_job** workflow - -> Create a new DeepWork job from scratch through definition, review, and implementation - -> Creates and manages multi-step AI workflows. Use when defining, implementing, or improving DeepWork jobs. 
- - -## Instructions - -**Goal**: Creates a job.yml specification by gathering workflow requirements through structured questions. Use when starting a new multi-step workflow. - -# Define Job Specification - -## Objective - -Create a `job.yml` specification file that defines the structure of a new DeepWork job by thoroughly understanding the user's workflow requirements through an interactive question-and-answer process. - -## Task - -Guide the user through defining a job specification by asking structured questions. **Do not attempt to create the specification without first fully understanding the user's needs.** - -**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. - -The output of this step is **only** the `job.yml` file - a complete specification of the workflow. The actual step instruction files will be created in the next step (`implement`). - -### Step 1: Understand the Job Purpose - -Start by asking structured questions to understand what the user wants to accomplish: - -1. **What is the overall goal of this workflow?** - - What complex task are they trying to accomplish? - - What domain is this in? (e.g., research, marketing, development, reporting) - - How often will they run this workflow? - -2. **What does success look like?** - - What's the final deliverable or outcome? - - Who is the audience for the output? - - What quality criteria matter most? - -3. **What are the major phases?** - - Ask them to describe the workflow at a high level - - What are the distinct stages from start to finish? - - Are there any dependencies between phases? - -### Step 1.5: Detect Document-Oriented Workflows - -**Check for document-focused patterns** in the user's description: -- Keywords: "report", "summary", "document", "create", "monthly", "quarterly", "for stakeholders", "for leadership" -- Final deliverable is a specific document (e.g., "AWS spending report", "competitive analysis", "sprint summary") -- Recurring documents with consistent structure - -**If a document-oriented workflow is detected:** - -1. Inform the user: "This workflow produces a specific document type. I recommend defining a doc spec first to ensure consistent quality." - -2. Ask structured questions to understand if they want to: - - Create a doc spec for this document - - Use an existing doc spec (if any exist in `.deepwork/doc_specs/`) - - Skip doc spec and proceed with simple outputs - -### Step 1.6: Define the Doc Spec (if needed) - -When creating a doc spec, gather the following information: - -1. **Document Identity** - - What is the document called? (e.g., "Monthly AWS Spending Report") - - Brief description of its purpose - - Where should these documents be stored? (path patterns like `finance/aws-reports/*.md`) - -2. **Audience and Context** - - Who reads this document? (target audience) - - How often is it produced? (frequency) - -3. **Quality Criteria** (3-5 criteria, each with name and description) - Examples for a spending report: - - **Visualization**: Must include charts showing spend breakdown by service - - **Variance Analysis**: Must compare current month against previous with percentages - - **Action Items**: Must include recommended cost optimization actions - -4. **Document Structure** - - What sections should it have? - - Any required elements (tables, charts, summaries)? 
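To see the shape of what these questions collect, here is a purely illustrative sketch of the doc-spec answers as Python dataclasses. The field names are hypothetical; DeepWork stores doc specs as markdown files, created in Step 1.7 from the template referenced there.

```python
# Illustrative only: the answers gathered in Step 1.6, as structured data.
# Hypothetical field names; the real doc spec is a markdown file.
from dataclasses import dataclass, field


@dataclass
class QualityCriterion:
    name: str         # e.g., "Variance Analysis"
    description: str  # e.g., "Compare current month against previous with percentages"


@dataclass
class DocSpecAnswers:
    title: str         # e.g., "Monthly AWS Spending Report"
    purpose: str       # brief description of the document's purpose
    path_pattern: str  # e.g., "finance/aws-reports/*.md"
    audience: str      # who reads the document
    frequency: str     # e.g., "monthly"
    criteria: list[QualityCriterion] = field(default_factory=list)  # 3-5 criteria
    sections: list[str] = field(default_factory=list)  # required structure and elements
```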
- -### Step 1.7: Create the doc spec File (if needed) - -Create the doc spec file at `.deepwork/doc_specs/[doc_spec_name].md`: - -**Template reference**: See `.deepwork/jobs/deepwork_jobs/templates/doc_spec.md.template` for the standard structure. - -**Complete example**: See `.deepwork/jobs/deepwork_jobs/templates/doc_spec.md.example` for a fully worked example. - -After creating the doc spec, proceed to Step 2 with the doc spec reference for the final step's output. - -### Step 2: Define Each Step - -For each major phase they mentioned, ask structured questions to gather details: - -1. **Step Purpose** - - What exactly does this step accomplish? - - What is the input to this step? - - What is the output from this step? - -2. **Step Inputs** - - What information is needed to start this step? - - Does it need user-provided parameters? (e.g., topic, target audience) - - Does it need files from previous steps? - - What format should inputs be in? - -3. **Step Outputs** - - What files or artifacts does this step produce? - - What format should the output be in? (markdown, YAML, JSON, etc.) - - Where should each output be saved? (filename/path) - - Should outputs be organized in subdirectories? (e.g., `reports/`, `data/`, `drafts/`) - - Will other steps need this output? - - **Important**: Output paths should always be within the main repository directory structure, not in dot-directories like `.deepwork/`. Dot-directories are for configuration and job definitions, not for job outputs. Use paths like `research/competitors/report.md` rather than `.deepwork/outputs/report.md`. - - **Does this output have a doc spec?** If a doc spec was created in Step 1.6/1.7, reference it for the appropriate output - -4. **Step Dependencies** - - Which previous steps must complete before this one? - - Are there any ordering constraints? - -5. **Step Process** (high-level understanding) - - What are the key activities in this step? - - Are there any quality checks or validation needed? - - What makes a good vs. bad output for this step? - -**Note**: You're gathering this information to understand what instructions will be needed, but you won't create the instruction files yet - that happens in the `implement` step. - -#### Doc Spec-Aware Output Format - -When a step produces a document with a doc spec reference, use this format in job.yml: - -```yaml -outputs: - - file: reports/monthly_spending.md - doc_spec: .deepwork/doc_specs/monthly_aws_report.md -``` - -The doc spec's quality criteria will automatically be included in the generated skill, ensuring consistent document quality. - -### Capability Considerations - -When defining steps, identify any that require specialized tools: - -**Browser Automation**: If any step involves web scraping, form filling, interactive browsing, UI testing, or research requiring website visits, ask the user what browser tools they have available. For Claude Code users, **Claude in Chrome** (Anthropic's browser extension) has been tested with DeepWork and is recommended for new users. Don't assume a default—confirm the tool before designing browser-dependent steps. - -### Step 3: Validate the Workflow - -After gathering information about all steps: - -1. **Review the flow** - - Summarize the complete workflow - - Show how outputs from one step feed into the next - - Ask if anything is missing - -2. **Check for gaps** - - Are there any steps where the input isn't clearly defined? - - Are there any outputs that aren't used by later steps? - - Are there circular dependencies? - -3. 
**Confirm details** - - Job name (lowercase, underscores, descriptive) - - Job summary (one clear sentence, max 200 chars) - - Job description (detailed multi-line explanation) - - Version number (start with 1.0.0) - -### Step 4: Define Quality Validation (Stop Hooks) - -For each step, consider whether it would benefit from **quality validation loops**. Stop hooks allow the AI agent to iteratively refine its work until quality criteria are met. - -**Ask structured questions about quality validation:** -- "Are there specific quality criteria that must be met for this step?" -- "Would you like the agent to validate its work before completing?" -- "What would make you send the work back for revision?" - -**Stop hooks are particularly valuable for:** -- Steps with complex outputs that need multiple checks -- Steps where quality is critical (final deliverables) -- Steps with subjective quality criteria that benefit from AI self-review - -**Three types of stop hooks are supported:** - -1. **Inline Prompt** (`prompt`) - Best for simple quality criteria - ```yaml - stop_hooks: - - prompt: | - Verify the output meets these criteria: - 1. Contains at least 5 competitors - 2. Each competitor has a description - 3. Selection rationale is clear - ``` - -2. **Prompt File** (`prompt_file`) - For detailed/reusable criteria - ```yaml - stop_hooks: - - prompt_file: hooks/quality_check.md - ``` - -3. **Script** (`script`) - For programmatic validation (tests, linting) - ```yaml - stop_hooks: - - script: hooks/run_tests.sh - ``` - -**Multiple hooks can be combined:** -```yaml -stop_hooks: - - script: hooks/lint_output.sh - - prompt: "Verify the content is comprehensive and well-organized" -``` - -**Encourage prompt-based hooks** - They leverage the AI's ability to understand context and make nuanced quality judgments. Script hooks are best for objective checks (syntax, format, tests). - -### Step 5: Create the Job Directory and Specification - -Only after you have complete understanding, create the job directory and `job.yml` file: - -**First, create the directory structure** using the `make_new_job.sh` script: - -```bash -.deepwork/jobs/deepwork_jobs/make_new_job.sh [job_name] -``` - -This creates: -- `.deepwork/jobs/[job_name]/` - Main job directory -- `.deepwork/jobs/[job_name]/steps/` - For step instruction files -- `.deepwork/jobs/[job_name]/hooks/` - For custom validation scripts -- `.deepwork/jobs/[job_name]/templates/` - For example file formats -- `.deepwork/jobs/[job_name]/AGENTS.md` - Job management guidance - -**Then create the job.yml file** at `.deepwork/jobs/[job_name]/job.yml` - -(Where `[job_name]` is the name of the NEW job you're creating, e.g., `competitive_research`) - -**Doc Spec**: See `.deepwork/doc_specs/job_spec.md` for the complete specification with quality criteria. - -**Template reference**: See `.deepwork/jobs/deepwork_jobs/templates/job.yml.template` for the standard structure. - -**Complete example**: See `.deepwork/jobs/deepwork_jobs/templates/job.yml.example` for a fully worked example. - -**Important**: -- Use lowercase with underscores for job name and step IDs -- Ensure file inputs reference steps in dependencies -- No circular dependencies -- At least one output per step -- The `summary` should be concise (max 200 chars) -- The `description` should provide rich context for future refinement -- Include a `changelog` section with an initial entry for version 1.0.0 - -## Example Dialog - -``` -Claude: I'll help you define a new DeepWork job. 
Let's start by understanding what you're trying to accomplish. - -What complex task or workflow are you trying to create? What's the overall goal? - -User: I want to create competitive research reports for my company - -Claude: Great! Let's dig into that. A few questions: - -1. What's the final deliverable? Is it a written report, a presentation, or something else? -2. Who is the audience? Your team, executives, clients? -3. How comprehensive should these reports be? - -User: It's a written report for our product team. Should be comprehensive - analyzing 3-5 competitors. - -Claude: Perfect. Now let's think about the major phases of creating this report. What are the main stages you go through? For example: -- Identifying which competitors to analyze? -- Gathering information about them? -- Analyzing the data? -- Writing the report? - -Or is your process different? - -User: Yes, that's close. I would say: -1. Identify competitors -2. Research each one -3. Create comparison analysis -4. Write positioning recommendations - -Claude: Excellent! Now let's define each step in detail. Starting with "Identify competitors": - -1. What inputs do you need to start? Do you already have a market segment in mind, or do you need to specify that? -2. What should the output look like? A list of competitor names? More detailed? -3. How many competitors should be identified? - -User: I need to specify the market segment and product category as inputs. The output should be a markdown file with 5-7 competitors and brief descriptions of each. - -Claude: Perfect. A few more questions about that output: -1. What should the file be called? Something like `competitors_list.md`? -2. Should it go in the project root, or in a subdirectory like `research/` or `data/`? - -User: Just call it `competitors_list.md` in the project root. - -Claude: Great. Now for the "Research each one" step: -1. Does it use the competitor list from the first step as input? -2. What kind of research - product features, pricing, marketing, all of the above? -3. What format should the research output be in? -4. Where should that research be saved? - -[continues for each step...] - -[After gathering all information:] - -Claude: Let me summarize the workflow I've designed based on our conversation: - -**Job: competitive_research** -Summary: Systematic competitive analysis workflow for product positioning - -Description: -A comprehensive workflow for analyzing competitors in your market segment. This job helps product teams understand the competitive landscape by systematically identifying competitors, researching their offerings, creating comparison matrices, and developing strategic positioning recommendations. - -The workflow produces: -- A vetted list of key competitors -- Detailed research notes on each competitor -- A comparison matrix highlighting key differentiators -- Strategic positioning recommendations - -Designed for product teams conducting quarterly competitive analysis. 
- -**Steps:** - -Step 1: identify_competitors -- Input: market_segment (user), product_category (user) -- Output: competitors_list.md (5-7 competitors with descriptions) - -Step 2: research_competitors -- Input: competitors_list.md (from step 1) -- Output: research_notes.md (detailed findings for each competitor) -- Depends on: identify_competitors - -Step 3: comparative_analysis -- Input: research_notes.md (from step 2) -- Output: comparison_matrix.md (side-by-side analysis) -- Depends on: research_competitors - -Step 4: positioning_recommendations -- Input: comparison_matrix.md (from step 3) -- Output: positioning_report.md (strategic recommendations) -- Depends on: comparative_analysis - -Does this accurately capture your workflow? Any changes needed? - -User: Yes, that's perfect! - -Claude: Great! Creating the job.yml specification now... - -[Creates .deepwork/jobs/competitive_research/job.yml with the complete spec] - -✓ Job specification created! - -**File created:** -- .deepwork/jobs/competitive_research/job.yml - -**Next step:** -Run `/deepwork_jobs.review_job_spec` to validate the specification against quality criteria. -``` - -## Important Guidelines - -1. **Focus on specification only** - Don't create instruction files yet -2. **Ask structured questions** - Never skip the discovery phase; use the AskUserQuestion tool -3. **Rich context in description** - This helps with future refinement -4. **Validate understanding** - Summarize and confirm before creating -5. **Use examples** - Help users understand what good specifications look like -6. **Understand file organization** - Always ask structured questions about where outputs should be saved and if subdirectories are needed - -## Validation Rules - -Before creating the job.yml, ensure: -- Job name: lowercase, underscores, no spaces -- Version: semantic versioning (1.0.0) -- Summary: concise, under 200 characters -- Description: detailed, provides context -- Step IDs: unique, descriptive, lowercase with underscores -- Dependencies: must reference existing step IDs -- File inputs: `from_step` must be in dependencies -- At least one output per step -- Outputs can be filenames (e.g., `report.md`) or paths (e.g., `reports/analysis.md`) -- File paths in outputs should match where files will actually be created -- No circular dependencies - -## Output Format - -### job.yml - -The complete YAML specification file (example shown in Step 5 above). - -**Location**: `.deepwork/jobs/[job_name]/job.yml` - -(Where `[job_name]` is the name of the new job being created) - -After creating the file: -1. Inform the user that the specification is complete -2. Recommend that they review the job.yml file -3. Tell them to run `/deepwork_jobs.review_job_spec` next - -## Quality Criteria - -- Asked structured questions to fully understand user requirements -- User fully understands what job they're creating -- All steps have clear inputs and outputs -- Dependencies make logical sense -- Summary is concise and descriptive -- Description provides rich context for future refinement -- Specification is valid YAML and follows the schema -- Ready for implementation step - - -### Job Context - -Core commands for managing DeepWork jobs. These commands help you define new multi-step -workflows and learn from running them. - -The `new_job` workflow guides you through defining and implementing a new job by -asking structured questions about your workflow, understanding each step's inputs and outputs, -reviewing the specification, and generating all necessary files. 
- -The `learn` skill reflects on conversations where DeepWork jobs were run, identifies -confusion or inefficiencies, and improves job instructions. It also captures bespoke -learnings specific to the current run into AGENTS.md files in the working folder. - - -## Required Inputs - -**User Parameters** - Gather from user before starting: -- **job_purpose**: What complex task or workflow are you trying to accomplish? - - -## Work Branch - -Use branch format: `deepwork/deepwork_jobs-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/deepwork_jobs-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `job.yml` - **Doc Spec**: DeepWork Job Specification - > YAML specification file that defines a multi-step workflow job for AI agents - **Definition**: `.deepwork/doc_specs/job_spec.md` - **Target Audience**: AI agents executing jobs and developers defining workflows - **Quality Criteria**: - 1. **Valid Identifier**: Job name must be lowercase with underscores, no spaces or special characters (e.g., `competitive_research`, `monthly_report`) - 2. **Semantic Version**: Version must follow semantic versioning format X.Y.Z (e.g., `1.0.0`, `2.1.3`) - 3. **Concise Summary**: Summary must be under 200 characters and clearly describe what the job accomplishes - 4. **Rich Description**: Description must be multi-line and explain: the problem solved, the process, expected outcomes, and target users - 5. **Changelog Present**: Must include a changelog array with at least the initial version entry. The changelog should include at most one entry per branch - 6. **Complete Steps**: Each step must have: id (lowercase_underscores), name, description, instructions_file, outputs (at least one), and dependencies array - 7. **Valid Dependencies**: Dependencies must reference existing step IDs with no circular references - 8. **Input Consistency**: File inputs with `from_step` must reference a step that is in the dependencies array - 9. **Output Paths**: Outputs must be valid filenames or paths within the main repo (not in dot-directories). Use specific, descriptive paths that lend themselves to glob patterns, e.g., `competitive_research/competitors_list.md` or `competitive_research/[competitor_name]/research.md`. Avoid generic names like `output.md`. - 10. **Concise Instructions**: The content of the file, particularly the description, must avoid excessive redundancy. It should be concise and to the point, since extra tokens can confuse the AI. - -
- Example Document Structure - - ```markdown - # DeepWork Job Specification: [job_name] - - A `job.yml` file defines a complete multi-step workflow that AI agents can execute. Each job breaks down a complex task into reviewable steps with clear inputs and outputs. - - ## Required Fields - - ### Top-Level Metadata - - ```yaml - name: job_name # lowercase, underscores only - version: "1.0.0" # semantic versioning - summary: "Brief description" # max 200 characters - description: | # detailed multi-line explanation - [Explain what this workflow does, why it exists, - what outputs it produces, and who should use it] - ``` - - ### Changelog - - ```yaml - changelog: - - version: "1.0.0" - changes: "Initial job creation" - - version: "1.1.0" - changes: "Added quality validation hooks" - ``` - - ### Steps Array - - ```yaml - steps: - - id: step_id # unique, lowercase_underscores - name: "Human Readable Name" - description: "What this step accomplishes" - instructions_file: steps/step_id.md - inputs: - # User-provided inputs: - - name: param_name - description: "What the user provides" - # File inputs from previous steps: - - file: output.md - from_step: previous_step_id - outputs: - - competitive_research/competitors_list.md # descriptive path - - competitive_research/[competitor_name]/research.md # parameterized path - # With doc spec reference: - - file: competitive_research/final_report.md - doc_spec: .deepwork/doc_specs/report_type.md - dependencies: - - previous_step_id # steps that must complete first - ``` - - ## Optional Fields - - ### Exposed Steps - - ```yaml - steps: - - id: learn - exposed: true # Makes step available without running dependencies - ``` - - ### Quality Hooks - - ```yaml - steps: - - id: step_id - hooks: - after_agent: - # Inline prompt for quality validation: - - prompt: | - Verify the output meets criteria: - 1. [Criterion 1] - 2. [Criterion 2] - If ALL criteria are met, include `...`. - # External prompt file: - - prompt_file: hooks/quality_check.md - # Script for programmatic validation: - - script: hooks/run_tests.sh - ``` - - ### Stop Hooks (Legacy) - - ```yaml - steps: - - id: step_id - stop_hooks: - - prompt: "Validation prompt..." - - prompt_file: hooks/check.md - - script: hooks/validate.sh - ``` - - ## Validation Rules - - 1. **No circular dependencies**: Step A cannot depend on Step B if Step B depends on Step A - 2. **File inputs require dependencies**: If a step uses `from_step: X`, then X must be in its dependencies - 3. **Unique step IDs**: No two steps can have the same id - 4. **Valid file paths**: Output paths must not contain invalid characters and should be in the main repo (not dot-directories) - 5. **Instructions files exist**: Each `instructions_file` path should have a corresponding file created - - ## Example: Complete Job Specification - - ```yaml - name: competitive_research - version: "1.0.0" - summary: "Systematic competitive analysis workflow" - description: | - A comprehensive workflow for analyzing competitors in your market segment. - Helps product teams understand the competitive landscape through systematic - identification, research, comparison, and positioning recommendations. 
- - Produces: - - Vetted competitor list - - Research notes per competitor - - Comparison matrix - - Strategic positioning report - - changelog: - - version: "1.0.0" - changes: "Initial job creation" - - steps: - - id: identify_competitors - name: "Identify Competitors" - description: "Identify 5-7 key competitors in the target market" - instructions_file: steps/identify_competitors.md - inputs: - - name: market_segment - description: "The market segment to analyze" - - name: product_category - description: "The product category" - outputs: - - competitive_research/competitors_list.md - dependencies: [] - - - id: research_competitors - name: "Research Competitors" - description: "Deep dive research on each identified competitor" - instructions_file: steps/research_competitors.md - inputs: - - file: competitive_research/competitors_list.md - from_step: identify_competitors - outputs: - - competitive_research/[competitor_name]/research.md - dependencies: - - identify_competitors - - - id: positioning_report - name: "Positioning Report" - description: "Strategic positioning recommendations" - instructions_file: steps/positioning_report.md - inputs: - - file: competitive_research/[competitor_name]/research.md - from_step: research_competitors - outputs: - - file: competitive_research/positioning_report.md - doc_spec: .deepwork/doc_specs/positioning_report.md - dependencies: - - research_competitors - ``` - ``` - -
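The validation rules in the spec above (unique step ids, acyclic dependencies, `from_step` consistency) are mechanical enough to sanity-check with a short script before handing the spec to an agent. A minimal sketch, assuming PyYAML is installed; DeepWork's own validator may implement these checks differently:

```python
import yaml  # PyYAML, assumed available

def check_job_structure(path: str) -> list[str]:
    """Return a list of structural problems in a job.yml (empty list = pass)."""
    with open(path) as f:
        spec = yaml.safe_load(f)
    errors = []
    step_list = spec.get("steps", [])
    steps = {s["id"]: s for s in step_list}
    if len(steps) != len(step_list):
        errors.append("duplicate step ids")
    for step in step_list:
        deps = step.get("dependencies", [])
        for dep in deps:
            if dep not in steps:
                errors.append(f"{step['id']}: unknown dependency '{dep}'")
        for inp in step.get("inputs", []):
            src = inp.get("from_step")
            if src and src not in deps:
                errors.append(f"{step['id']}: from_step '{src}' is not in dependencies")
    # Depth-first search for dependency cycles
    visiting, done = set(), set()
    def visit(step_id):
        if step_id in done:
            return
        if step_id in visiting:
            errors.append(f"circular dependency involving '{step_id}'")
            return
        visiting.add(step_id)
        for dep in steps.get(step_id, {}).get("dependencies", []):
            if dep in steps:
                visit(dep)
        visiting.discard(step_id)
        done.add(step_id)
    for step_id in steps:
        visit(step_id)
    return errors
```

An empty result only covers the structural rules; criteria like summary length and description richness still need the doc spec review.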
- -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. **User Understanding**: Did the agent fully understand the user's workflow by asking structured questions? -2. **Structured Questions Used**: Did the agent ask structured questions (using the AskUserQuestion tool) to gather user input? -3. **Document Detection**: For document-oriented workflows, did the agent detect patterns and offer doc spec creation? -4. **doc spec Created (if applicable)**: If a doc spec was needed, was it created in `.deepwork/doc_specs/[doc_spec_name].md` with proper quality criteria? -5. **doc spec References**: Are document outputs properly linked to their doc specs using `{file, doc_spec}` format? -6. **Valid Against doc spec**: Does the job.yml conform to the job.yml doc spec quality criteria (valid identifier, semantic version, concise summary, rich description, complete steps, valid dependencies)? -7. **Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs? -8. **Logical Dependencies**: Do step dependencies make sense and avoid circular references? -9. **Concise Summary**: Is the summary under 200 characters and descriptive? -10. **Rich Description**: Does the description provide enough context for future refinement? -11. **Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)? -12. **File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`? -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "new_job step 1/3 complete, outputs: job.yml" -3. **Continue workflow**: Use Skill tool to invoke `/deepwork_jobs.review_job_spec` - ---- - -**Reference files**: `.deepwork/jobs/deepwork_jobs/job.yml`, `.deepwork/jobs/deepwork_jobs/steps/define.md` \ No newline at end of file diff --git a/.claude/skills/deepwork_jobs.implement/SKILL.md b/.claude/skills/deepwork_jobs.implement/SKILL.md deleted file mode 100644 index c9510269..00000000 --- a/.claude/skills/deepwork_jobs.implement/SKILL.md +++ /dev/null @@ -1,335 +0,0 @@ ---- -name: deepwork_jobs.implement -description: "Generates step instruction files and syncs slash commands from the job.yml specification. Use after job spec review passes." -user-invocable: false - ---- - -# deepwork_jobs.implement - -**Step 3/3** in **new_job** workflow - -> Create a new DeepWork job from scratch through definition, review, and implementation - -> Creates and manages multi-step AI workflows. Use when defining, implementing, or improving DeepWork jobs. 
- -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/deepwork_jobs.review_job_spec` - -## Instructions - -**Goal**: Generates step instruction files and syncs slash commands from the job.yml specification. Use after job spec review passes. - -# Implement Job Steps - -## Objective - -Generate the DeepWork job directory structure and instruction files for each step based on the validated `job.yml` specification from the review_job_spec step. - -## Task - -Read the `job.yml` specification file and create all the necessary files to make the job functional, including directory structure and step instruction files. Then sync the commands to make them available. - -### Step 1: Create Directory Structure Using Script - -Run the `make_new_job.sh` script to create the standard directory structure: - -```bash -.deepwork/jobs/deepwork_jobs/make_new_job.sh [job_name] -``` - -This creates: -- `.deepwork/jobs/[job_name]/` - Main job directory -- `.deepwork/jobs/[job_name]/steps/` - Step instruction files -- `.deepwork/jobs/[job_name]/hooks/` - Custom validation scripts (with .gitkeep) -- `.deepwork/jobs/[job_name]/templates/` - Example file formats (with .gitkeep) -- `.deepwork/jobs/[job_name]/AGENTS.md` - Job management guidance - -**Note**: If the directory already exists (e.g., job.yml was created by define step), you can skip this step or manually create the additional directories: -```bash -mkdir -p .deepwork/jobs/[job_name]/hooks .deepwork/jobs/[job_name]/templates -touch .deepwork/jobs/[job_name]/hooks/.gitkeep .deepwork/jobs/[job_name]/templates/.gitkeep -``` - -### Step 2: Read and Validate the Specification - -1. **Locate the job.yml file** - - Read `.deepwork/jobs/[job_name]/job.yml` from the review_job_spec step - - Parse the YAML content - -2. **Validate the specification** - - Ensure it follows the schema (name, version, summary, description, steps) - - Check that all dependencies reference existing steps - - Verify no circular dependencies - - Confirm file inputs match dependencies - -3. **Extract key information** - - Job name, version, summary, description - - List of all steps with their details - - Understand the workflow structure - -### Step 3: Generate Step Instruction Files - -For each step in the job.yml, create a comprehensive instruction file at `.deepwork/jobs/[job_name]/steps/[step_id].md`. - -**Template reference**: See `.deepwork/jobs/deepwork_jobs/templates/step_instruction.md.template` for the standard structure. - -**Complete example**: See `.deepwork/jobs/deepwork_jobs/templates/step_instruction.md.example` for a fully worked example. - -**Available templates in `.deepwork/jobs/deepwork_jobs/templates/`:** -- `job.yml.template` - Job specification structure -- `step_instruction.md.template` - Step instruction file structure -- `agents.md.template` - AGENTS.md file structure -- `job.yml.example` - Complete job specification example -- `step_instruction.md.example` - Complete step instruction example - -**Guidelines for generating instructions:** - -1. **Use the job description** - The detailed description from job.yml provides crucial context -2. **Be specific** - Don't write generic instructions; tailor them to the step's purpose -3. **Provide examples** - Show what good output looks like -4. **Explain the "why"** - Help the user understand the step's role in the workflow -5. **Quality over quantity** - Detailed, actionable instructions are better than vague ones -6. 
**Align with stop hooks** - If the step has `stop_hooks` defined, ensure the quality criteria in the instruction file match the validation criteria in the hooks -7. **Ask structured questions** - When a step has user inputs, the instructions MUST explicitly tell the agent to "ask structured questions" using the AskUserQuestion tool to gather that information. Never use generic phrasing like "ask the user" - always use "ask structured questions" - -### Handling Stop Hooks - -If a step in the job.yml has `stop_hooks` defined, the generated instruction file should: - -1. **Mirror the quality criteria** - The "Quality Criteria" section should match what the stop hooks will validate -2. **Be explicit about success** - Help the agent understand when the step is truly complete -3. **Include the promise pattern** - Mention that `✓ Quality Criteria Met` should be included when criteria are met - -**Example: If the job.yml has:** -```yaml -- id: research_competitors - name: "Research Competitors" - stop_hooks: - - prompt: | - Verify the research meets criteria: - 1. Each competitor has at least 3 data points - 2. Sources are cited - 3. Information is current (within last year) -``` - -**The instruction file should include:** -```markdown -## Quality Criteria - -- Each competitor has at least 3 distinct data points -- All information is sourced with citations -- Data is current (from within the last year) -- When all criteria are met, include `✓ Quality Criteria Met` in your response -``` - -This alignment ensures the AI agent knows exactly what will be validated and can self-check before completing. - -### Using Supplementary Reference Files - -Step instructions can include additional `.md` files in the `steps/` directory for detailed examples, templates, or reference material. Reference them using the full path from the project root. - -See `.deepwork/jobs/deepwork_jobs/steps/supplemental_file_references.md` for detailed documentation and examples. - -### Step 4: Verify job.yml Location - -Verify that `job.yml` is in the correct location at `.deepwork/jobs/[job_name]/job.yml`. The define and review_job_spec steps should have created and validated it. If for some reason it's not there, you may need to create or move it. - -### Step 5: Sync Skills - -Run `deepwork sync` to generate the skills for this job: - -```bash -deepwork sync -``` - -This will: -- Parse the job definition -- Generate skills for each step -- Make the skills available in `.claude/skills/` (or appropriate platform directory) - -### Step 6: Relay Reload Instructions - -After running `deepwork sync`, look at the "To use the new skills" section in the output. **Relay these exact reload instructions to the user** so they know how to pick up the new skills. Don't just reference the sync output - tell them directly what they need to do (e.g., "Type 'exit' then run 'claude --resume'" for Claude Code, or "Run '/memory refresh'" for Gemini CLI). - -### Step 7: Consider Rules for the New Job - -After implementing the job, consider whether there are **rules** that would help enforce quality or consistency when working with this job's domain. - -**What are rules?** - -Rules are automated guardrails stored as markdown files in `.deepwork/rules/` that trigger when certain files change during an AI session. 
They help ensure: -- Documentation stays in sync with code -- Team guidelines are followed -- Architectural decisions are respected -- Quality standards are maintained - -**When to suggest rules:** - -Think about the job you just implemented and ask: -- Does this job produce outputs that other files depend on? -- Are there documentation files that should be updated when this job's outputs change? -- Are there quality checks or reviews that should happen when certain files in this domain change? -- Could changes to the job's output files impact other parts of the project? - -**Examples of rules that might make sense:** - -| Job Type | Potential Rule | -|----------|----------------| -| API Design | "Update API docs when endpoint definitions change" | -| Database Schema | "Review migrations when schema files change" | -| Competitive Research | "Update strategy docs when competitor analysis changes" | -| Feature Development | "Update changelog when feature files change" | -| Configuration Management | "Update install guide when config files change" | - -**How to offer rule creation:** - -If you identify one or more rules that would benefit the user, explain: -1. **What the rule would do** - What triggers it and what action it prompts -2. **Why it would help** - How it prevents common mistakes or keeps things in sync -3. **What files it would watch** - The trigger patterns - -Then ask the user: - -> "Would you like me to create this rule for you? I can run `/deepwork_rules.define` to set it up." - -If the user agrees, invoke the `/deepwork_rules.define` command to guide them through creating the rule. - -**Example dialogue:** - -``` -Based on the competitive_research job you just created, I noticed that when -competitor analysis files change, it would be helpful to remind you to update -your strategy documentation. - -I'd suggest a rule like: -- **Name**: "Update strategy when competitor analysis changes" -- **Trigger**: `**/positioning_report.md` -- **Action**: Prompt to review and update `docs/strategy.md` - -Would you like me to create this rule? I can run `/deepwork_rules.define` to set it up. -``` - -**Note:** Not every job needs rules. Only suggest them when they would genuinely help maintain consistency or quality. Don't force rules where they don't make sense. - -## Example Implementation - -For a complete worked example showing a job.yml and corresponding step instruction file, see: -- **Job specification**: `.deepwork/jobs/deepwork_jobs/templates/job.yml.example` -- **Step instruction**: `.deepwork/jobs/deepwork_jobs/templates/step_instruction.md.example` - -## Important Guidelines - -1. **Read the spec carefully** - Understand the job's intent from the description -2. **Generate complete instructions** - Don't create placeholder or stub files -3. **Maintain consistency** - Use the same structure for all step instruction files -4. **Provide examples** - Show what good output looks like -5. **Use context** - The job description provides valuable context for each step -6. 
**Be specific** - Tailor instructions to the specific step, not generic advice - -## Validation Before Sync - -Before running `deepwork sync`, verify: -- All directories exist -- `job.yml` is in place -- All step instruction files exist (one per step) -- No file system errors - -## Completion Checklist - -Before marking this step complete, ensure: -- [ ] job.yml validated and copied to job directory -- [ ] All step instruction files created -- [ ] Each instruction file is complete and actionable -- [ ] `deepwork sync` executed successfully -- [ ] Skills generated in platform directory -- [ ] User informed to follow reload instructions from `deepwork sync` -- [ ] Considered whether rules would benefit this job (Step 7) -- [ ] If rules suggested, offered to run `/deepwork_rules.define` - -## Quality Criteria - -- Job directory structure is correct -- All instruction files are complete (not stubs) -- Instructions are specific and actionable -- Output examples are provided in each instruction file -- Quality criteria defined for each step -- Steps with user inputs explicitly use "ask structured questions" phrasing -- Sync completed successfully -- Skills available for use -- Thoughtfully considered relevant rules for the job domain - - -### Job Context - -Core commands for managing DeepWork jobs. These commands help you define new multi-step -workflows and learn from running them. - -The `new_job` workflow guides you through defining and implementing a new job by -asking structured questions about your workflow, understanding each step's inputs and outputs, -reviewing the specification, and generating all necessary files. - -The `learn` skill reflects on conversations where DeepWork jobs were run, identifies -confusion or inefficiencies, and improves job instructions. It also captures bespoke -learnings specific to the current run into AGENTS.md files in the working folder. - - -## Required Inputs - - -**Files from Previous Steps** - Read these first: -- `job.yml` (from `review_job_spec`) - -## Work Branch - -Use branch format: `deepwork/deepwork_jobs-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/deepwork_jobs-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `steps/` (directory) - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. **Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly? -2. **Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)? -3. **Specific & Actionable**: Are instructions tailored to each step's purpose, not generic? -4. **Output Examples**: Does each instruction file show what good output looks like? -5. **Quality Criteria**: Does each instruction file define quality criteria for its outputs? -6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"? -7. **Sync Complete**: Has `deepwork sync` been run successfully? 
-8. **Skills Available**: Are the skills generated in `.claude/skills/` (or the appropriate platform directory)? -9. **Rules Considered**: Has the agent thought about whether rules would benefit this job? If relevant rules were identified, did they explain them and offer to run `/deepwork_rules.define`? Not every job needs rules - only suggest when genuinely helpful. -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "new_job step 3/3 complete, outputs: steps/" -3. **new_job workflow complete**: All steps finished. Consider creating a PR to merge the work branch. - ---- - -**Reference files**: `.deepwork/jobs/deepwork_jobs/job.yml`, `.deepwork/jobs/deepwork_jobs/steps/implement.md` \ No newline at end of file diff --git a/.claude/skills/deepwork_jobs.learn/SKILL.md b/.claude/skills/deepwork_jobs.learn/SKILL.md deleted file mode 100644 index f2db5a56..00000000 --- a/.claude/skills/deepwork_jobs.learn/SKILL.md +++ /dev/null @@ -1,451 +0,0 @@ ---- -name: deepwork_jobs.learn -description: "Analyzes conversation history to improve job instructions and capture learnings. Use after running a job to refine it." - ---- - -# deepwork_jobs.learn - -**Standalone skill** - can be run anytime - -> Creates and manages multi-step AI workflows. Use when defining, implementing, or improving DeepWork jobs. - - -## Instructions - -**Goal**: Analyzes conversation history to improve job instructions and capture learnings. Use after running a job to refine it. - -# Learn from Job Execution - -## Objective - -Think deeply about this task. Reflect on the current conversation to identify learnings from DeepWork job executions, improve job instructions with generalizable insights, and capture bespoke (run-specific) learnings in AGENTS.md files in the deepest common folder that would contain all work on the topic in the future. - -## Task - -Analyze the conversation history to extract learnings and improvements, then apply them appropriately: -- **Generalizable learnings** → Update job instruction files -- **Bespoke learnings** (specific to this run) → Add to AGENTS.md in the deepest common folder for the topic - -### Step 1: Analyze Conversation for Job Executions - -1. **Scan the conversation** for DeepWork slash commands that were run - - Look for patterns like `/job_name.step_id` - - Identify which jobs and steps were executed - - Note the order of execution - -2. **Identify the target folder** - - This should be the deepest common folder that would contain all work on the topic in the future - - Should be clear from conversation history where work was done - - If unclear, run `git diff` to see where changes were made on the branch - -3. **If no job was specified**, ask the user: - - "Which DeepWork job would you like me to learn from?" - - List available jobs from `.deepwork/jobs/` - -### Step 2: Identify Points of Confusion and Inefficiency - -Review the conversation for: - -1.
**Confusion signals** - - Questions the agent asked that shouldn't have been necessary - - Misunderstandings about what a step required - - Incorrect outputs that needed correction - - Ambiguous instructions that led to wrong interpretations - -2. **Inefficiency signals** - - Extra steps or iterations that were needed - - Information that had to be repeated - - Context that was missing from instructions - - Dependencies that weren't clear - -3. **Error patterns** - - Failed validations and why they failed - - Quality criteria that were misunderstood - - Edge cases that weren't handled - -4. **Success patterns** - - What worked particularly well - - Efficient approaches worth preserving - - Good examples that could be added to instructions - -### Step 3: Classify Learnings - -For each learning identified, determine if it is: - -**Generalizable** (should improve instructions): -- Would help ANY future run of this job -- Addresses unclear or missing guidance -- Fixes incorrect assumptions in instructions -- Adds helpful examples or context -- Examples: - - "Step instructions should mention that X format is required" - - "Quality criteria should include checking for Y" - - "Add example of correct output format" - -**doc spec-Related** (should improve doc spec files): -- Improvements to document quality criteria -- Changes to document structure or format -- Updated audience or frequency information -- Examples: - - "The report should include a summary table" - - "Quality criterion 'Visualization' needs clearer requirements" - - "Documents need a section for action items" - -**Bespoke** (should go in AGENTS.md): -- Specific to THIS project/codebase/run -- Depends on local conventions or structure -- References specific files or paths -- Would not apply to other uses of this job -- Examples: - - "In this codebase, API endpoints are in `src/api/`" - - "This project uses camelCase for function names" - - "The main config file is at `config/settings.yml`" - -### Step 3.5: Identify doc spec-Related Learnings - -Review the conversation for doc spec-related improvements: - -1. **Quality Criteria Changes** - - Were any quality criteria unclear or insufficient? - - Did the agent repeatedly fail certain criteria? - - Are there new criteria that should be added? - -2. **Document Structure Changes** - - Did the user request different sections? - - Were parts of the document format confusing? - - Should the example document be updated? - -3. **Metadata Updates** - - Has the target audience changed? - - Should frequency or path patterns be updated? - -**Signals for doc spec improvements:** -- User asked for changes to document format -- Repeated validation failures on specific criteria -- Feedback about missing sections or information -- Changes to how documents are organized/stored - -### Step 4: Update Job Instructions (Generalizable Learnings) - -For each generalizable learning: - -1. **Locate the instruction file** - - Path: `.deepwork/jobs/[job_name]/steps/[step_id].md` - -2. **Make targeted improvements** - - Add missing context or clarification - - Include helpful examples - - Clarify ambiguous instructions - - Update quality criteria if needed - -3. **Keep instructions concise** - - Avoid redundancy - don't repeat the same guidance in multiple places - - Be direct - remove verbose explanations that don't add value - - Prefer bullet points over paragraphs where appropriate - -4. 
**Preserve instruction structure** - - Keep existing sections (Objective, Task, Process, Output Format, Quality Criteria) - - Add to appropriate sections rather than restructuring - - Maintain consistency with other steps - -5. **Track changes for changelog** - - Note what was changed and why - - Prepare changelog entry for job.yml - -### Step 4b: Extract Shared Content into Referenced Files - -Review all instruction files for the job and identify content that: -- Appears in multiple step instructions (duplicated) -- Is lengthy and could be extracted for clarity -- Would benefit from being maintained in one place - -**Extract to shared files:** - -1. **Create shared files** in `.deepwork/jobs/[job_name]/steps/shared/` - - `conventions.md` - Coding/formatting conventions used across steps - - `examples.md` - Common examples referenced by multiple steps - - `schemas.md` - Data structures or formats used throughout - -2. **Reference from instructions** using markdown includes or explicit references: - ```markdown - ## Conventions - - Follow the conventions defined in `shared/conventions.md`. - ``` - -3. **Benefits of extraction:** - - Single source of truth - update once, applies everywhere - - Shorter instruction files - easier to read and maintain - - Consistent guidance across steps - -### Step 4.5: Update doc spec Files (doc spec-Related Learnings) - -If doc spec-related learnings were identified: - -1. **Locate the doc spec file** - - Find doc spec references in job.yml outputs (look for `doc_spec: .deepwork/doc_specs/[doc_spec_name].md`) - - doc spec files are at `.deepwork/doc_specs/[doc_spec_name].md` - -2. **Update quality_criteria array** - - Add new criteria with name and description - - Modify existing criteria descriptions for clarity - - Remove criteria that are no longer relevant - -3. **Update example document** - - Modify the markdown body to reflect structure changes - - Ensure the example matches updated criteria - -4. **Update metadata as needed** - - target_audience: If audience has changed - - frequency: If production cadence has changed - - path_patterns: If storage location has changed - -**Example doc spec update:** -```yaml -# Before -quality_criteria: - - name: Visualization - description: Include charts - -# After -quality_criteria: - - name: Visualization - description: Include Mermaid.js charts showing spend breakdown by service and month-over-month trend -``` - -### Step 5: Create/Update AGENTS.md (Bespoke Learnings) - -The AGENTS.md file captures project-specific knowledge that helps future agent runs. - -1. **Determine the correct location** - - Place AGENTS.md in the deepest common folder that would contain all work on the topic in the future - - This ensures the knowledge is available when working in that context - - If uncertain, place at the project root - -2. **Use file references where possible** - - Instead of duplicating information, reference source files - - This keeps AGENTS.md in sync as the codebase evolves - - Pattern: "See `path/to/file.ext` for [description]" - -3. **AGENTS.md structure**: See `.deepwork/jobs/deepwork_jobs/templates/agents.md.template` for the standard format. - -4. **Writing entries** - - Be concise but specific - - Always prefer file references over inline content - - Use line numbers when referencing specific code: `file.ext:42` - - Group related learnings together - -### Step 6: Update Job Version and Changelog - -If instruction files were modified: - -1. 
**Bump version in job.yml** - - Patch version (0.0.x) for instruction improvements - - Minor version (0.x.0) if quality criteria changed - -2. **Add changelog entry** - ```yaml - - version: "[new_version]" - changes: "Improved [step] instructions based on execution learnings: [brief description]" - ``` - -### Step 7: Sync and Relay Instructions - -1. **Run deepwork sync** (if instructions were modified) - ```bash - deepwork sync - ``` - -2. **If skills were regenerated**, look at the "To use the new skills" section in the `deepwork sync` output and **relay these exact reload instructions to the user** (e.g., "Type 'exit' then run 'claude --resume'" for Claude Code) - -## File Reference Patterns - -When adding entries to AGENTS.md, prefer these patterns: - -**Good (references)**: -```markdown -- API endpoints follow REST conventions. See `src/api/routes.ts` for examples. -- Error handling pattern: See `src/utils/errors.ts:15-30` -- Configuration schema: Defined in `config/schema.json` -``` - -**Avoid (duplicating)**: -```markdown -- API endpoints should return JSON with this format: { status: ..., data: ... } -- Error handling should use try/catch with custom error classes -- Configuration requires these fields: name, version, ... -``` - -## Quality Criteria - -- Conversation has been analyzed for job executions -- Points of confusion and inefficiency are identified -- Learnings are correctly classified (generalizable vs bespoke) -- Job instructions updated for generalizable improvements -- Instructions are concise - no redundancy or unnecessary verbosity -- Shared/lengthy content extracted into referenced files where appropriate -- AGENTS.md created/updated with bespoke learnings -- File references used instead of duplicating content -- AGENTS.md is in the correct folder (the deepest common folder for the topic) -- When all criteria are met, include `✓ Quality Criteria Met` - -## Example Dialog - -``` -User: /deepwork_jobs.learn - -Claude: I'll analyze this conversation for learnings from DeepWork job executions. - -**Conversation Analysis** - -I found the following job executions: -1. `/competitive_research.identify_competitors` - Completed -2. `/competitive_research.research_competitors` - Completed with 2 iterations -3. `/competitive_research.comparative_analysis` - Completed - -**Identified Issues** - -1. **Confusion in step 2**: The instructions didn't specify that competitor research should include pricing information. This caused an extra iteration. - -2. **Missing context**: The agent had to ask about the industry focus, which should have been carried forward from step 1. - -**Classification** - -- Generalizable: Add pricing to research requirements in instructions -- Bespoke: This project focuses on B2B SaaS in the HR tech space - -**Actions Taken** - -1. Updated `.deepwork/jobs/competitive_research/steps/research_competitors.md`: - - Added "Include pricing information (tiers, models, approximate ranges)" to research requirements - - Added note about carrying forward industry context from previous step - -2. 
Created `./competitive_research_output/AGENTS.md`: - ```markdown - # Project Context for Competitive Research - - ## Job-Specific Context - - ### competitive_research - - #### Industry Focus - - Domain: B2B SaaS, HR Technology sector - - See `identify_competitors_output.md` for market definition - - #### Competitors Identified - - Reference: `competitor_list.md` for full details - - ## Last Updated - - Date: 2024-01-15 - - From conversation about: Initial competitive analysis run - ``` - -3. Updated job.yml version to 1.0.1 with changelog entry - -4. Ran `deepwork sync` - -**Summary** - -Updated job instructions and created AGENTS.md with bespoke learnings. To get the updated skills, type 'exit' then run 'claude --resume'. -``` - -## Handling Edge Cases - -**No job executions found**: -- Ask user which job they'd like to analyze -- Or offer to review available jobs - -**Multiple jobs executed**: -- Analyze each job separately -- Create separate AGENTS.md entries or files as appropriate - -**AGENTS.md already exists**: -- Read existing content -- Append new learnings to appropriate sections -- Update "Last Updated" section - -**No issues found**: -- Document what worked well -- Consider if any successful patterns should be added to instructions as examples - -**Sensitive information**: -- Never include secrets, credentials, or PII in AGENTS.md -- Reference config files instead of including values - - -### Job Context - -Core commands for managing DeepWork jobs. These commands help you define new multi-step -workflows and learn from running them. - -The `new_job` workflow guides you through defining and implementing a new job by -asking structured questions about your workflow, understanding each step's inputs and outputs, -reviewing the specification, and generating all necessary files. - -The `learn` skill reflects on conversations where DeepWork jobs were run, identifies -confusion or inefficiencies, and improves job instructions. It also captures bespoke -learnings specific to the current run into AGENTS.md files in the working folder. - - -## Required Inputs - -**User Parameters** - Gather from user before starting: -- **job_name**: Name of the job that was run (optional - will auto-detect from conversation) - - -## Work Branch - -Use branch format: `deepwork/deepwork_jobs-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/deepwork_jobs-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `AGENTS.md` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. **Conversation Analyzed**: Did the agent review the conversation for DeepWork job executions? -2. **Confusion Identified**: Did the agent identify points of confusion, errors, or inefficiencies? -3. **Instructions Improved**: Were job instructions updated to address identified issues? -4. **Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity? -5. 
**Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files? -6. **doc spec Reviewed (if applicable)**: For jobs with doc spec outputs, were doc spec-related learnings identified? -7. **doc spec Updated (if applicable)**: Were doc spec files updated with improved quality criteria or structure? -8. **Bespoke Learnings Captured**: Were run-specific learnings added to AGENTS.md? -9. **File References Used**: Do AGENTS.md entries reference other files where appropriate? -10. **Working Folder Correct**: Is AGENTS.md in the correct working folder for the job? -11. **Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md? -12. **Sync Complete**: Has `deepwork sync` been run if instructions were modified? -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "learn complete, outputs: AGENTS.md" - -This standalone skill can be re-run anytime. - ---- - -**Reference files**: `.deepwork/jobs/deepwork_jobs/job.yml`, `.deepwork/jobs/deepwork_jobs/steps/learn.md` \ No newline at end of file diff --git a/.claude/skills/deepwork_jobs.review_job_spec/SKILL.md b/.claude/skills/deepwork_jobs.review_job_spec/SKILL.md deleted file mode 100644 index 99a565ef..00000000 --- a/.claude/skills/deepwork_jobs.review_job_spec/SKILL.md +++ /dev/null @@ -1,483 +0,0 @@ ---- -name: deepwork_jobs.review_job_spec -description: "Reviews job.yml against quality criteria using a sub-agent for unbiased validation. Use after defining a job specification." -user-invocable: false - ---- - -# deepwork_jobs.review_job_spec - -**Step 2/3** in **new_job** workflow - -> Create a new DeepWork job from scratch through definition, review, and implementation - -> Creates and manages multi-step AI workflows. Use when defining, implementing, or improving DeepWork jobs. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/deepwork_jobs.define` - -## Instructions - -**Goal**: Reviews job.yml against quality criteria using a sub-agent for unbiased validation. Use after defining a job specification. - -# Review Job Specification - -## Objective - -Review the `job.yml` created in the define step against the doc spec quality criteria using a sub-agent for unbiased evaluation, then iterate on fixes until all criteria pass. - -## Why This Step Exists - -The define step focuses on understanding user requirements and creating a job specification. This review step ensures the specification meets quality standards before implementation. Using a sub-agent provides an unbiased "fresh eyes" review that catches issues the main agent might miss after being deeply involved in the definition process. - -## Task - -Use a sub-agent to review the job.yml against all 9 doc spec quality criteria, then fix any failed criteria. Repeat until all criteria pass. 
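Conceptually, the step is a small fixed-point loop: review, fix, repeat until nothing fails. A rough sketch of that control flow, with hypothetical callables standing in for the sub-agent spawn (Step 2) and the edits (Step 4) — in practice the loop is driven through the Task tool, not code:

```python
from typing import Callable

def review_until_pass(
    job_yml: str,
    review: Callable[[str], list[dict]],    # hypothetical: wraps the Haiku sub-agent (Step 2)
    fix: Callable[[str, list[dict]], str],  # hypothetical: applies edits to job.yml (Step 4)
    max_rounds: int = 5,
) -> str:
    """Loop until the review reports no FAIL verdicts."""
    for _ in range(max_rounds):
        failures = [f for f in review(job_yml) if f.get("verdict") == "FAIL"]
        if not failures:
            return job_yml  # all criteria pass (Step 6)
        job_yml = fix(job_yml, failures)
    raise RuntimeError("quality criteria still failing after repeated reviews")
```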
- -### Step 1: Read the Job Specification - -Read the `job.yml` file created in the define step: - -``` -.deepwork/jobs/[job_name]/job.yml -``` - -Also read the doc spec for reference: - -``` -.deepwork/doc_specs/job_spec.md -``` - -### Step 2: Spawn Review Sub-Agent - -Use the Task tool to spawn a sub-agent that will provide an unbiased review: - -``` -Task tool parameters: -- subagent_type: "general-purpose" -- model: "haiku" -- description: "Review job.yml against doc spec" -- prompt: [see below] -``` - -**Sub-agent prompt template:** - -``` -Review this job.yml against the following 9 quality criteria from the doc spec. - -For each criterion, respond with: -- PASS or FAIL -- If FAIL: specific issue and suggested fix - -## job.yml Content - -[paste the full job.yml content here] - -## Quality Criteria - -1. **Valid Identifier**: Job name must be lowercase with underscores, no spaces or special characters (e.g., `competitive_research`, `monthly_report`) - -2. **Semantic Version**: Version must follow semantic versioning format X.Y.Z (e.g., `1.0.0`, `2.1.3`) - -3. **Concise Summary**: Summary must be under 200 characters and clearly describe what the job accomplishes - -4. **Rich Description**: Description must be multi-line and explain: the problem solved, the process, expected outcomes, and target users - -5. **Changelog Present**: Must include a changelog array with at least the initial version entry - -6. **Complete Steps**: Each step must have: id (lowercase_underscores), name, description, instructions_file, outputs (at least one), and dependencies array - -7. **Valid Dependencies**: Dependencies must reference existing step IDs with no circular references - -8. **Input Consistency**: File inputs with `from_step` must reference a step that is in the dependencies array - -9. **Output Paths**: Outputs must be valid filenames or paths (e.g., `report.md` or `reports/analysis.md`) - -## Response Format - -Respond with a structured evaluation: - -### Overall: [X/9 PASS] - -### Criterion Results - -1. Valid Identifier: [PASS/FAIL] - [If FAIL: Issue and fix] - -2. Semantic Version: [PASS/FAIL] - [If FAIL: Issue and fix] - -[... continue for all 9 criteria ...] - -### Summary of Required Fixes - -[List any fixes needed, or "No fixes required - all criteria pass"] -``` - -### Step 3: Review Sub-Agent Findings - -Parse the sub-agent's response: - -1. **Count passing criteria** - How many of the 9 criteria passed? -2. **Identify failures** - List specific criteria that failed -3. **Note suggested fixes** - What changes does the sub-agent recommend? 
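If it helps to tally the verdicts mechanically, a short sketch that assumes the sub-agent followed the response format above (real responses can drift, so still read the review itself):

```python
import re

def tally_review(response: str) -> tuple[int, list[str]]:
    """Count PASS verdicts and collect the names of FAILed criteria.

    Matches lines like "3. Concise Summary: FAIL" from the response format.
    """
    results = re.findall(r"^\s*\d+\.\s*([^:\n]+):\s*(PASS|FAIL)\b", response, re.MULTILINE)
    failures = [name.strip() for name, verdict in results if verdict == "FAIL"]
    return len(results) - len(failures), failures

# Example: passed, failed = tally_review(subagent_response)
```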
- -### Step 4: Fix Failed Criteria - -For each failed criterion, edit the job.yml to address the issue: - -**Common fixes by criterion:** - -| Criterion | Common Issue | Fix | |-----------|-------------|-----| | Valid Identifier | Spaces or uppercase | Convert to lowercase_underscores | | Semantic Version | Missing or invalid format | Set to `"1.0.0"` or fix format | | Concise Summary | Too long or vague | Shorten to <200 chars, be specific | | Rich Description | Single line or missing context | Add multi-line explanation with problem/process/outcome/users | | Changelog Present | Missing changelog | Add `changelog:` with initial version entry | | Complete Steps | Missing required fields | Add id, name, description, instructions_file, outputs, dependencies | | Valid Dependencies | Non-existent step or circular | Fix step ID reference or reorder dependencies | | Input Consistency | from_step not in dependencies | Add the referenced step to dependencies array | | Output Paths | Invalid characters or format | Use valid filename/path format | - -### Step 5: Re-Run Review (If Needed) - -If any criteria failed: - -1. **Spawn a new sub-agent** with the updated job.yml content -2. **Review the new findings** -3. **Fix any remaining issues** -4. **Repeat until all 9 criteria pass** - -### Step 6: Confirm Completion - -When all 9 criteria pass: - -1. **Announce success**: "All 9 doc spec quality criteria pass." -2. **List what was validated**: - - Valid Identifier - - Semantic Version - - Concise Summary - - Rich Description - - Changelog Present - - Complete Steps - - Valid Dependencies - - Input Consistency - - Output Paths -3. **Include the promise**: `✓ Quality Criteria Met` -4. **Guide to next step**: "Run `/deepwork_jobs.implement` to generate the step instruction files." - -## Example Review Cycle - -``` -Claude: Let me review the job.yml against the doc spec quality criteria. - -[Spawns sub-agent with haiku model] - -Sub-agent response: -### Overall: 7/9 PASS - -### Criterion Results -1. Valid Identifier: PASS -2. Semantic Version: PASS -3. Concise Summary: FAIL - Issue: Summary is 245 characters, exceeds 200 char limit - Fix: Shorten to "Competitive analysis workflow for product positioning" -4. Rich Description: PASS -5. Changelog Present: FAIL - Issue: No changelog array present - Fix: Add changelog with initial version entry -6. Complete Steps: PASS -7. Valid Dependencies: PASS -8. Input Consistency: PASS -9. Output Paths: PASS - -### Summary of Required Fixes -1. Shorten summary to under 200 characters -2. Add changelog array with initial version - -Claude: I found 2 issues. Let me fix them. - -[Edits job.yml to shorten summary and add changelog] - -Claude: Let me verify the fixes with another review. - -[Spawns new sub-agent] - -Sub-agent response: -### Overall: 9/9 PASS - -### Summary of Required Fixes -No fixes required - all criteria pass - -Claude: All 9 doc spec quality criteria pass. - -✓ Quality Criteria Met - -**Next step:** Run `/deepwork_jobs.implement` to generate the step instruction files. 
-``` - -## Quality Criteria - -- **Sub-Agent Used**: A sub-agent was spawned to provide unbiased review (not just self-review) -- **All doc spec Criteria Evaluated**: The sub-agent assessed all 9 quality criteria from the doc spec -- **Findings Addressed**: All failed criteria were fixed by the main agent -- **Validation Loop Complete**: The review-fix cycle continued until all criteria passed -- **Promise Included**: The response includes `✓ Quality Criteria Met` when complete - -## Output - -The validated `job.yml` file at `.deepwork/jobs/[job_name]/job.yml` that passes all 9 doc spec quality criteria. - - -### Job Context - -Core commands for managing DeepWork jobs. These commands help you define new multi-step -workflows and learn from running them. - -The `new_job` workflow guides you through defining and implementing a new job by -asking structured questions about your workflow, understanding each step's inputs and outputs, -reviewing the specification, and generating all necessary files. - -The `learn` skill reflects on conversations where DeepWork jobs were run, identifies -confusion or inefficiencies, and improves job instructions. It also captures bespoke -learnings specific to the current run into AGENTS.md files in the working folder. - - -## Required Inputs - - -**Files from Previous Steps** - Read these first: -- `job.yml` (from `define`) - -## Work Branch - -Use branch format: `deepwork/deepwork_jobs-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/deepwork_jobs-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `job.yml` - **Doc Spec**: DeepWork Job Specification - > YAML specification file that defines a multi-step workflow job for AI agents - **Definition**: `.deepwork/doc_specs/job_spec.md` - **Target Audience**: AI agents executing jobs and developers defining workflows - **Quality Criteria**: - 1. **Valid Identifier**: Job name must be lowercase with underscores, no spaces or special characters (e.g., `competitive_research`, `monthly_report`) - 2. **Semantic Version**: Version must follow semantic versioning format X.Y.Z (e.g., `1.0.0`, `2.1.3`) - 3. **Concise Summary**: Summary must be under 200 characters and clearly describe what the job accomplishes - 4. **Rich Description**: Description must be multi-line and explain: the problem solved, the process, expected outcomes, and target users - 5. **Changelog Present**: Must include a changelog array with at least the initial version entry. The changelog should include at most one entry per branch - 6. **Complete Steps**: Each step must have: id (lowercase_underscores), name, description, instructions_file, outputs (at least one), and dependencies array - 7. **Valid Dependencies**: Dependencies must reference existing step IDs with no circular references - 8. **Input Consistency**: File inputs with `from_step` must reference a step that is in the dependencies array - 9. **Output Paths**: Outputs must be valid filenames or paths within the main repo (not in dot-directories). Use specific, descriptive paths that lend themselves to glob patterns, e.g., `competitive_research/competitors_list.md` or `competitive_research/[competitor_name]/research.md`. Avoid generic names like `output.md`. - 10. **Concise Instructions**: The content of the file, particularly the description, must avoid excessive redundancy. It should be concise and to the point, since extra tokens can confuse the AI. - -
- Example Document Structure - - ```markdown - # DeepWork Job Specification: [job_name] - - A `job.yml` file defines a complete multi-step workflow that AI agents can execute. Each job breaks down a complex task into reviewable steps with clear inputs and outputs. - - ## Required Fields - - ### Top-Level Metadata - - ```yaml - name: job_name # lowercase, underscores only - version: "1.0.0" # semantic versioning - summary: "Brief description" # max 200 characters - description: | # detailed multi-line explanation - [Explain what this workflow does, why it exists, - what outputs it produces, and who should use it] - ``` - - ### Changelog - - ```yaml - changelog: - - version: "1.0.0" - changes: "Initial job creation" - - version: "1.1.0" - changes: "Added quality validation hooks" - ``` - - ### Steps Array - - ```yaml - steps: - - id: step_id # unique, lowercase_underscores - name: "Human Readable Name" - description: "What this step accomplishes" - instructions_file: steps/step_id.md - inputs: - # User-provided inputs: - - name: param_name - description: "What the user provides" - # File inputs from previous steps: - - file: output.md - from_step: previous_step_id - outputs: - - competitive_research/competitors_list.md # descriptive path - - competitive_research/[competitor_name]/research.md # parameterized path - # With doc spec reference: - - file: competitive_research/final_report.md - doc_spec: .deepwork/doc_specs/report_type.md - dependencies: - - previous_step_id # steps that must complete first - ``` - - ## Optional Fields - - ### Exposed Steps - - ```yaml - steps: - - id: learn - exposed: true # Makes step available without running dependencies - ``` - - ### Quality Hooks - - ```yaml - steps: - - id: step_id - hooks: - after_agent: - # Inline prompt for quality validation: - - prompt: | - Verify the output meets criteria: - 1. [Criterion 1] - 2. [Criterion 2] - If ALL criteria are met, include `...`. - # External prompt file: - - prompt_file: hooks/quality_check.md - # Script for programmatic validation: - - script: hooks/run_tests.sh - ``` - - ### Stop Hooks (Legacy) - - ```yaml - steps: - - id: step_id - stop_hooks: - - prompt: "Validation prompt..." - - prompt_file: hooks/check.md - - script: hooks/validate.sh - ``` - - ## Validation Rules - - 1. **No circular dependencies**: Step A cannot depend on Step B if Step B depends on Step A - 2. **File inputs require dependencies**: If a step uses `from_step: X`, then X must be in its dependencies - 3. **Unique step IDs**: No two steps can have the same id - 4. **Valid file paths**: Output paths must not contain invalid characters and should be in the main repo (not dot-directories) - 5. **Instructions files exist**: Each `instructions_file` path should have a corresponding file created - - ## Example: Complete Job Specification - - ```yaml - name: competitive_research - version: "1.0.0" - summary: "Systematic competitive analysis workflow" - description: | - A comprehensive workflow for analyzing competitors in your market segment. - Helps product teams understand the competitive landscape through systematic - identification, research, comparison, and positioning recommendations. 
- - Produces: - - Vetted competitor list - - Research notes per competitor - - Comparison matrix - - Strategic positioning report - - changelog: - - version: "1.0.0" - changes: "Initial job creation" - - steps: - - id: identify_competitors - name: "Identify Competitors" - description: "Identify 5-7 key competitors in the target market" - instructions_file: steps/identify_competitors.md - inputs: - - name: market_segment - description: "The market segment to analyze" - - name: product_category - description: "The product category" - outputs: - - competitive_research/competitors_list.md - dependencies: [] - - - id: research_competitors - name: "Research Competitors" - description: "Deep dive research on each identified competitor" - instructions_file: steps/research_competitors.md - inputs: - - file: competitive_research/competitors_list.md - from_step: identify_competitors - outputs: - - competitive_research/[competitor_name]/research.md - dependencies: - - identify_competitors - - - id: positioning_report - name: "Positioning Report" - description: "Strategic positioning recommendations" - instructions_file: steps/positioning_report.md - inputs: - - file: competitive_research/[competitor_name]/research.md - from_step: research_competitors - outputs: - - file: competitive_research/positioning_report.md - doc_spec: .deepwork/doc_specs/positioning_report.md - dependencies: - - research_competitors - ``` - ``` - -
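Several of these validation rules are mechanical, so they can be pre-checked in code before asking a sub-agent to review. A minimal sketch of such a pre-check (assuming PyYAML is available; the function name and example path are illustrative, not part of DeepWork):

```python
# Minimal pre-check sketch for the mechanical validation rules above.
# Assumes PyYAML ("pip install pyyaml"); this is not DeepWork's actual validator.
import re
import yaml

def precheck_job_spec(path: str) -> list[str]:
    with open(path) as f:
        spec = yaml.safe_load(f)
    errors = []
    if not re.fullmatch(r"[a-z][a-z0-9_]*", str(spec.get("name", ""))):
        errors.append("name must be lowercase_underscores")
    if not re.fullmatch(r"\d+\.\d+\.\d+", str(spec.get("version", ""))):
        errors.append("version must follow X.Y.Z")
    if len(spec.get("summary") or "") >= 200:
        errors.append("summary must be under 200 characters")
    if not spec.get("changelog"):
        errors.append("changelog needs at least one entry")
    steps = spec.get("steps") or []
    ids = {s.get("id") for s in steps}
    for s in steps:
        sid, deps = s.get("id", "?"), s.get("dependencies") or []
        errors += [f"{sid}: unknown dependency '{d}'" for d in deps if d not in ids]
        for inp in s.get("inputs") or []:
            src = inp.get("from_step")
            if src and src not in deps:
                errors.append(f"{sid}: from_step '{src}' is not in dependencies")
        if not s.get("outputs"):
            errors.append(f"{sid}: needs at least one output")
    return errors  # circular-dependency detection omitted for brevity

if __name__ == "__main__":
    for err in precheck_job_spec(".deepwork/jobs/competitive_research/job.yml"):
        print("FAIL:", err)
```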
-

## Guardrails

- Do NOT skip prerequisite verification if this step has dependencies
- Do NOT produce partial outputs; complete all required outputs before finishing
- Do NOT proceed without required inputs; ask the user if any are missing
- Do NOT modify files outside the scope of this step's defined outputs

## Quality Validation

**Before completing this step, you MUST have your work reviewed against the quality criteria below.**

Use a sub-agent (Haiku model) to review your work against these criteria:

**Criteria (all must be satisfied)**:
1. **Sub-Agent Used**: Was a sub-agent spawned to provide unbiased review?
2. **All doc spec Criteria Evaluated**: Did the sub-agent assess all 9 quality criteria?
3. **Findings Addressed**: Were all failed criteria addressed by the main agent?
4. **Validation Loop Complete**: Did the review-fix cycle continue until all criteria passed?

**Review Process**:
1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above
2. The sub-agent should examine your outputs and verify each criterion is met
3. If the sub-agent identifies valid issues, fix them
4. Have the sub-agent review again until all valid feedback has been addressed
5. Only mark the step complete when the sub-agent confirms all criteria are satisfied

## On Completion

1. Verify outputs are created
2. Inform user: "new_job step 2/3 complete, outputs: job.yml"
3. **Continue workflow**: Use the Skill tool to invoke `/deepwork_jobs.implement`

---

**Reference files**: `.deepwork/jobs/deepwork_jobs/job.yml`, `.deepwork/jobs/deepwork_jobs/steps/review_job_spec.md`
\ No newline at end of file
diff --git a/.claude/skills/deepwork_jobs/SKILL.md b/.claude/skills/deepwork_jobs/SKILL.md
index ec2526f5..55d7daf2 100644
--- a/.claude/skills/deepwork_jobs/SKILL.md
+++ b/.claude/skills/deepwork_jobs/SKILL.md
@@ -3,12 +3,10 @@ name: deepwork_jobs
 description: "Creates and manages multi-step AI workflows. Use when defining, implementing, or improving DeepWork jobs."
 ---
 
-# deepwork_jobs
+# deepwork_jobs Agent
 
 Creates and manages multi-step AI workflows. Use when defining, implementing, or improving DeepWork jobs.
 
-> **CRITICAL**: Always invoke steps using the Skill tool. Never copy/paste step instructions directly.
-
 Core commands for managing DeepWork jobs. These commands help you define new multi-step
 workflows and learn from running them.
 
@@ -21,64 +19,1488 @@ confusion or inefficiencies, and improves job instructions. It also captures bes
 learnings specific to the current run into AGENTS.md files in the working folder.
 
 
-## Workflows
+## Agent Overview
+
+This agent handles the **deepwork_jobs** job with 4 skills.
+
+**Workflows**: new_job
+**Standalone Skills**: learn
+
+---
+
+## How to Use This Agent
+
+### Workflows
+- **new_job**: Create a new DeepWork job from scratch through definition, review, and implementation (define → review_job_spec → implement)
+  - Start: `define`
+
+### Standalone Skills (run anytime)
+- **learn**: Analyzes conversation history to improve job instructions and capture learnings. Use after running a job to refine it.
+
+### All Skills
+- `define` - Creates a job.yml specification by gathering workflow requirements through structured questions. Use when starting a new multi-step workflow.
+- `review_job_spec` - Reviews job.yml against quality criteria using a sub-agent for unbiased validation. Use after defining a job specification.
+- `implement` - Generates step instruction files and syncs slash commands from the job.yml specification. Use after job spec review passes. +- `learn` - Analyzes conversation history to improve job instructions and capture learnings. Use after running a job to refine it. + +--- + +## Agent Execution Instructions + +When invoked, follow these steps: + +### Step 1: Understand Intent + +Parse the user's request to determine: +1. Which workflow or skill to execute +2. Any parameters or context provided +3. Whether this is a continuation of previous work + +### Step 2: Check Work Branch + +Before executing any skill: +1. Check current git branch +2. If on a `deepwork/deepwork_jobs-*` branch: continue using it +3. If on main/master: create new branch `deepwork/deepwork_jobs-[instance]-$(date +%Y%m%d)` + +### Step 3: Execute the Appropriate Skill + +Navigate to the relevant skill section below and follow its instructions. + +### Step 4: Workflow Continuation + +After completing a workflow step: +1. Inform the user of completion and outputs created +2. Automatically proceed to the next step if one exists +3. Continue until the workflow is complete or the user intervenes + +--- + +## Skills + +### Skill: define + +**Type**: Workflow step 1/3 in **new_job** + +**Description**: Creates a job.yml specification by gathering workflow requirements through structured questions. Use when starting a new multi-step workflow. + + +#### Required User Input + +Gather these from the user before starting: +- **job_purpose**: What complex task or workflow are you trying to accomplish? + + +#### Instructions + +# Define Job Specification + +## Objective + +Create a `job.yml` specification file that defines the structure of a new DeepWork job by thoroughly understanding the user's workflow requirements through an interactive question-and-answer process. + +## Task + +Guide the user through defining a job specification by asking structured questions. **Do not attempt to create the specification without first fully understanding the user's needs.** + +**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. + +The output of this step is **only** the `job.yml` file - a complete specification of the workflow. The actual step instruction files will be created in the next step (`implement`). + +### Step 1: Understand the Job Purpose + +Start by asking structured questions to understand what the user wants to accomplish: + +1. **What is the overall goal of this workflow?** + - What complex task are they trying to accomplish? + - What domain is this in? (e.g., research, marketing, development, reporting) + - How often will they run this workflow? + +2. **What does success look like?** + - What's the final deliverable or outcome? + - Who is the audience for the output? + - What quality criteria matter most? + +3. **What are the major phases?** + - Ask them to describe the workflow at a high level + - What are the distinct stages from start to finish? + - Are there any dependencies between phases? 
+ +### Step 1.5: Detect Document-Oriented Workflows + +**Check for document-focused patterns** in the user's description: +- Keywords: "report", "summary", "document", "create", "monthly", "quarterly", "for stakeholders", "for leadership" +- Final deliverable is a specific document (e.g., "AWS spending report", "competitive analysis", "sprint summary") +- Recurring documents with consistent structure + +**If a document-oriented workflow is detected:** + +1. Inform the user: "This workflow produces a specific document type. I recommend defining a doc spec first to ensure consistent quality." + +2. Ask structured questions to understand if they want to: + - Create a doc spec for this document + - Use an existing doc spec (if any exist in `.deepwork/doc_specs/`) + - Skip doc spec and proceed with simple outputs + +### Step 1.6: Define the Doc Spec (if needed) + +When creating a doc spec, gather the following information: + +1. **Document Identity** + - What is the document called? (e.g., "Monthly AWS Spending Report") + - Brief description of its purpose + - Where should these documents be stored? (path patterns like `finance/aws-reports/*.md`) + +2. **Audience and Context** + - Who reads this document? (target audience) + - How often is it produced? (frequency) + +3. **Quality Criteria** (3-5 criteria, each with name and description) + Examples for a spending report: + - **Visualization**: Must include charts showing spend breakdown by service + - **Variance Analysis**: Must compare current month against previous with percentages + - **Action Items**: Must include recommended cost optimization actions + +4. **Document Structure** + - What sections should it have? + - Any required elements (tables, charts, summaries)? + +### Step 1.7: Create the doc spec File (if needed) + +Create the doc spec file at `.deepwork/doc_specs/[doc_spec_name].md`: + +**Template reference**: See `.deepwork/jobs/deepwork_jobs/templates/doc_spec.md.template` for the standard structure. + +**Complete example**: See `.deepwork/jobs/deepwork_jobs/templates/doc_spec.md.example` for a fully worked example. + +After creating the doc spec, proceed to Step 2 with the doc spec reference for the final step's output. + +### Step 2: Define Each Step + +For each major phase they mentioned, ask structured questions to gather details: + +1. **Step Purpose** + - What exactly does this step accomplish? + - What is the input to this step? + - What is the output from this step? + +2. **Step Inputs** + - What information is needed to start this step? + - Does it need user-provided parameters? (e.g., topic, target audience) + - Does it need files from previous steps? + - What format should inputs be in? + +3. **Step Outputs** + - What files or artifacts does this step produce? + - What format should the output be in? (markdown, YAML, JSON, etc.) + - Where should each output be saved? (filename/path) + - Should outputs be organized in subdirectories? (e.g., `reports/`, `data/`, `drafts/`) + - Will other steps need this output? + + **Important**: Output paths should always be within the main repository directory structure, not in dot-directories like `.deepwork/`. Dot-directories are for configuration and job definitions, not for job outputs. Use paths like `research/competitors/report.md` rather than `.deepwork/outputs/report.md`. + - **Does this output have a doc spec?** If a doc spec was created in Step 1.6/1.7, reference it for the appropriate output + +4. **Step Dependencies** + - Which previous steps must complete before this one? 
+ - Are there any ordering constraints? + +5. **Step Process** (high-level understanding) + - What are the key activities in this step? + - Are there any quality checks or validation needed? + - What makes a good vs. bad output for this step? + +**Note**: You're gathering this information to understand what instructions will be needed, but you won't create the instruction files yet - that happens in the `implement` step. + +#### Doc Spec-Aware Output Format + +When a step produces a document with a doc spec reference, use this format in job.yml: + +```yaml +outputs: + - file: reports/monthly_spending.md + doc_spec: .deepwork/doc_specs/monthly_aws_report.md +``` + +The doc spec's quality criteria will automatically be included in the generated skill, ensuring consistent document quality. + +### Capability Considerations + +When defining steps, identify any that require specialized tools: + +**Browser Automation**: If any step involves web scraping, form filling, interactive browsing, UI testing, or research requiring website visits, ask the user what browser tools they have available. For Claude Code users, **Claude in Chrome** (Anthropic's browser extension) has been tested with DeepWork and is recommended for new users. Don't assume a default—confirm the tool before designing browser-dependent steps. + +### Step 3: Validate the Workflow + +After gathering information about all steps: + +1. **Review the flow** + - Summarize the complete workflow + - Show how outputs from one step feed into the next + - Ask if anything is missing + +2. **Check for gaps** + - Are there any steps where the input isn't clearly defined? + - Are there any outputs that aren't used by later steps? + - Are there circular dependencies? + +3. **Confirm details** + - Job name (lowercase, underscores, descriptive) + - Job summary (one clear sentence, max 200 chars) + - Job description (detailed multi-line explanation) + - Version number (start with 1.0.0) + +### Step 4: Define Quality Validation (Stop Hooks) + +For each step, consider whether it would benefit from **quality validation loops**. Stop hooks allow the AI agent to iteratively refine its work until quality criteria are met. + +**Ask structured questions about quality validation:** +- "Are there specific quality criteria that must be met for this step?" +- "Would you like the agent to validate its work before completing?" +- "What would make you send the work back for revision?" + +**Stop hooks are particularly valuable for:** +- Steps with complex outputs that need multiple checks +- Steps where quality is critical (final deliverables) +- Steps with subjective quality criteria that benefit from AI self-review + +**Three types of stop hooks are supported:** + +1. **Inline Prompt** (`prompt`) - Best for simple quality criteria + ```yaml + stop_hooks: + - prompt: | + Verify the output meets these criteria: + 1. Contains at least 5 competitors + 2. Each competitor has a description + 3. Selection rationale is clear + ``` + +2. **Prompt File** (`prompt_file`) - For detailed/reusable criteria + ```yaml + stop_hooks: + - prompt_file: hooks/quality_check.md + ``` + +3. 
**Script** (`script`) - For programmatic validation (tests, linting) + ```yaml + stop_hooks: + - script: hooks/run_tests.sh + ``` + +**Multiple hooks can be combined:** +```yaml +stop_hooks: + - script: hooks/lint_output.sh + - prompt: "Verify the content is comprehensive and well-organized" +``` + +**Encourage prompt-based hooks** - They leverage the AI's ability to understand context and make nuanced quality judgments. Script hooks are best for objective checks (syntax, format, tests). + +### Step 5: Create the Job Directory and Specification + +Only after you have complete understanding, create the job directory and `job.yml` file: + +**First, create the directory structure** using the `make_new_job.sh` script: + +```bash +.deepwork/jobs/deepwork_jobs/make_new_job.sh [job_name] +``` + +This creates: +- `.deepwork/jobs/[job_name]/` - Main job directory +- `.deepwork/jobs/[job_name]/steps/` - For step instruction files +- `.deepwork/jobs/[job_name]/hooks/` - For custom validation scripts +- `.deepwork/jobs/[job_name]/templates/` - For example file formats +- `.deepwork/jobs/[job_name]/AGENTS.md` - Job management guidance + +**Then create the job.yml file** at `.deepwork/jobs/[job_name]/job.yml` + +(Where `[job_name]` is the name of the NEW job you're creating, e.g., `competitive_research`) + +**Doc Spec**: See `.deepwork/doc_specs/job_spec.md` for the complete specification with quality criteria. + +**Template reference**: See `.deepwork/jobs/deepwork_jobs/templates/job.yml.template` for the standard structure. + +**Complete example**: See `.deepwork/jobs/deepwork_jobs/templates/job.yml.example` for a fully worked example. + +**Important**: +- Use lowercase with underscores for job name and step IDs +- Ensure file inputs reference steps in dependencies +- No circular dependencies +- At least one output per step +- The `summary` should be concise (max 200 chars) +- The `description` should provide rich context for future refinement +- Include a `changelog` section with an initial entry for version 1.0.0 + +## Example Dialog + +``` +Claude: I'll help you define a new DeepWork job. Let's start by understanding what you're trying to accomplish. + +What complex task or workflow are you trying to create? What's the overall goal? + +User: I want to create competitive research reports for my company + +Claude: Great! Let's dig into that. A few questions: + +1. What's the final deliverable? Is it a written report, a presentation, or something else? +2. Who is the audience? Your team, executives, clients? +3. How comprehensive should these reports be? + +User: It's a written report for our product team. Should be comprehensive - analyzing 3-5 competitors. + +Claude: Perfect. Now let's think about the major phases of creating this report. What are the main stages you go through? For example: +- Identifying which competitors to analyze? +- Gathering information about them? +- Analyzing the data? +- Writing the report? + +Or is your process different? + +User: Yes, that's close. I would say: +1. Identify competitors +2. Research each one +3. Create comparison analysis +4. Write positioning recommendations + +Claude: Excellent! Now let's define each step in detail. Starting with "Identify competitors": + +1. What inputs do you need to start? Do you already have a market segment in mind, or do you need to specify that? +2. What should the output look like? A list of competitor names? More detailed? +3. How many competitors should be identified? 
+ +User: I need to specify the market segment and product category as inputs. The output should be a markdown file with 5-7 competitors and brief descriptions of each. + +Claude: Perfect. A few more questions about that output: +1. What should the file be called? Something like `competitors_list.md`? +2. Should it go in the project root, or in a subdirectory like `research/` or `data/`? + +User: Just call it `competitors_list.md` in the project root. + +Claude: Great. Now for the "Research each one" step: +1. Does it use the competitor list from the first step as input? +2. What kind of research - product features, pricing, marketing, all of the above? +3. What format should the research output be in? +4. Where should that research be saved? + +[continues for each step...] + +[After gathering all information:] + +Claude: Let me summarize the workflow I've designed based on our conversation: + +**Job: competitive_research** +Summary: Systematic competitive analysis workflow for product positioning + +Description: +A comprehensive workflow for analyzing competitors in your market segment. This job helps product teams understand the competitive landscape by systematically identifying competitors, researching their offerings, creating comparison matrices, and developing strategic positioning recommendations. + +The workflow produces: +- A vetted list of key competitors +- Detailed research notes on each competitor +- A comparison matrix highlighting key differentiators +- Strategic positioning recommendations + +Designed for product teams conducting quarterly competitive analysis. + +**Steps:** + +Step 1: identify_competitors +- Input: market_segment (user), product_category (user) +- Output: competitors_list.md (5-7 competitors with descriptions) + +Step 2: research_competitors +- Input: competitors_list.md (from step 1) +- Output: research_notes.md (detailed findings for each competitor) +- Depends on: identify_competitors + +Step 3: comparative_analysis +- Input: research_notes.md (from step 2) +- Output: comparison_matrix.md (side-by-side analysis) +- Depends on: research_competitors + +Step 4: positioning_recommendations +- Input: comparison_matrix.md (from step 3) +- Output: positioning_report.md (strategic recommendations) +- Depends on: comparative_analysis + +Does this accurately capture your workflow? Any changes needed? + +User: Yes, that's perfect! + +Claude: Great! Creating the job.yml specification now... + +[Creates .deepwork/jobs/competitive_research/job.yml with the complete spec] + +✓ Job specification created! + +**File created:** +- .deepwork/jobs/competitive_research/job.yml + +**Next step:** +Run `/deepwork_jobs.review_job_spec` to validate the specification against quality criteria. +``` + +## Important Guidelines + +1. **Focus on specification only** - Don't create instruction files yet +2. **Ask structured questions** - Never skip the discovery phase; use the AskUserQuestion tool +3. **Rich context in description** - This helps with future refinement +4. **Validate understanding** - Summarize and confirm before creating +5. **Use examples** - Help users understand what good specifications look like +6. 
**Understand file organization** - Always ask structured questions about where outputs should be saved and if subdirectories are needed + +## Validation Rules + +Before creating the job.yml, ensure: +- Job name: lowercase, underscores, no spaces +- Version: semantic versioning (1.0.0) +- Summary: concise, under 200 characters +- Description: detailed, provides context +- Step IDs: unique, descriptive, lowercase with underscores +- Dependencies: must reference existing step IDs +- File inputs: `from_step` must be in dependencies +- At least one output per step +- Outputs can be filenames (e.g., `report.md`) or paths (e.g., `reports/analysis.md`) +- File paths in outputs should match where files will actually be created +- No circular dependencies + +## Output Format + +### job.yml + +The complete YAML specification file (example shown in Step 5 above). + +**Location**: `.deepwork/jobs/[job_name]/job.yml` + +(Where `[job_name]` is the name of the new job being created) + +After creating the file: +1. Inform the user that the specification is complete +2. Recommend that they review the job.yml file +3. Tell them to run `/deepwork_jobs.review_job_spec` next + +## Quality Criteria + +- Asked structured questions to fully understand user requirements +- User fully understands what job they're creating +- All steps have clear inputs and outputs +- Dependencies make logical sense +- Summary is concise and descriptive +- Description provides rich context for future refinement +- Specification is valid YAML and follows the schema +- Ready for implementation step + + +#### Outputs + +Create these files/directories: +- `job.yml` + **Doc Spec**: DeepWork Job Specification + > YAML specification file that defines a multi-step workflow job for AI agents + **Quality Criteria**: + - **Valid Identifier**: Job name must be lowercase with underscores, no spaces or special characters (e.g., `competitive_research`, `monthly_report`) + - **Semantic Version**: Version must follow semantic versioning format X.Y.Z (e.g., `1.0.0`, `2.1.3`) + - **Concise Summary**: Summary must be under 200 characters and clearly describe what the job accomplishes + - **Rich Description**: Description must be multi-line and explain: the problem solved, the process, expected outcomes, and target users + - **Changelog Present**: Must include a changelog array with at least the initial version entry. Changelog should only include one entry per branch at most + - **Complete Steps**: Each step must have: id (lowercase_underscores), name, description, instructions_file, outputs (at least one), and dependencies array + - **Valid Dependencies**: Dependencies must reference existing step IDs with no circular references + - **Input Consistency**: File inputs with `from_step` must reference a step that is in the dependencies array + - **Output Paths**: Outputs must be valid filenames or paths within the main repo (not in dot-directories). Use specific, descriptive paths that lend themselves to glob patterns, e.g., `competitive_research/competitors_list.md` or `competitive_research/[competitor_name]/research.md`. Avoid generic names like `output.md`. + - **Concise Instructions**: The content of the file, particularly the description, must not have excessively redundant information. It should be concise and to the point given that extra tokens will confuse the AI. + +#### Quality Validation + +Before completing this skill, verify: +1. **User Understanding**: Did the agent fully understand the user's workflow by asking structured questions? +2. 
**Structured Questions Used**: Did the agent ask structured questions (using the AskUserQuestion tool) to gather user input? +3. **Document Detection**: For document-oriented workflows, did the agent detect patterns and offer doc spec creation? +4. **doc spec Created (if applicable)**: If a doc spec was needed, was it created in `.deepwork/doc_specs/[doc_spec_name].md` with proper quality criteria? +5. **doc spec References**: Are document outputs properly linked to their doc specs using `{file, doc_spec}` format? +6. **Valid Against doc spec**: Does the job.yml conform to the job.yml doc spec quality criteria (valid identifier, semantic version, concise summary, rich description, complete steps, valid dependencies)? +7. **Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs? +8. **Logical Dependencies**: Do step dependencies make sense and avoid circular references? +9. **Concise Summary**: Is the summary under 200 characters and descriptive? +10. **Rich Description**: Does the description provide enough context for future refinement? +11. **Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)? +12. **File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`? + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "define complete, outputs: job.yml" +3. **Continue to next skill**: Proceed to `review_job_spec` + +--- + +### Skill: review_job_spec + +**Type**: Workflow step 2/3 in **new_job** + +**Description**: Reviews job.yml against quality criteria using a sub-agent for unbiased validation. Use after defining a job specification. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `define` + + +#### Input Files + +Read these files (from previous steps): +- `job.yml` (from `define`) + +#### Instructions + +# Review Job Specification + +## Objective + +Review the `job.yml` created in the define step against the doc spec quality criteria using a sub-agent for unbiased evaluation, then iterate on fixes until all criteria pass. + +## Why This Step Exists + +The define step focuses on understanding user requirements and creating a job specification. This review step ensures the specification meets quality standards before implementation. Using a sub-agent provides an unbiased "fresh eyes" review that catches issues the main agent might miss after being deeply involved in the definition process. + +## Task + +Use a sub-agent to review the job.yml against all 9 doc spec quality criteria, then fix any failed criteria. Repeat until all criteria pass. + +### Step 1: Read the Job Specification + +Read the `job.yml` file created in the define step: + +``` +.deepwork/jobs/[job_name]/job.yml +``` + +Also read the doc spec for reference: + +``` +.deepwork/doc_specs/job_spec.md +``` + +### Step 2: Spawn Review Sub-Agent + +Use the Task tool to spawn a sub-agent that will provide an unbiased review: + +``` +Task tool parameters: +- subagent_type: "general-purpose" +- model: "haiku" +- description: "Review job.yml against doc spec" +- prompt: [see below] +``` + +**Sub-agent prompt template:** + +``` +Review this job.yml against the following 9 quality criteria from the doc spec. + +For each criterion, respond with: +- PASS or FAIL +- If FAIL: specific issue and suggested fix + +## job.yml Content + +[paste the full job.yml content here] + +## Quality Criteria + +1. 
**Valid Identifier**: Job name must be lowercase with underscores, no spaces or special characters (e.g., `competitive_research`, `monthly_report`) + +2. **Semantic Version**: Version must follow semantic versioning format X.Y.Z (e.g., `1.0.0`, `2.1.3`) + +3. **Concise Summary**: Summary must be under 200 characters and clearly describe what the job accomplishes + +4. **Rich Description**: Description must be multi-line and explain: the problem solved, the process, expected outcomes, and target users + +5. **Changelog Present**: Must include a changelog array with at least the initial version entry + +6. **Complete Steps**: Each step must have: id (lowercase_underscores), name, description, instructions_file, outputs (at least one), and dependencies array + +7. **Valid Dependencies**: Dependencies must reference existing step IDs with no circular references + +8. **Input Consistency**: File inputs with `from_step` must reference a step that is in the dependencies array + +9. **Output Paths**: Outputs must be valid filenames or paths (e.g., `report.md` or `reports/analysis.md`) + +## Response Format + +Respond with a structured evaluation: + +### Overall: [X/9 PASS] + +### Criterion Results + +1. Valid Identifier: [PASS/FAIL] + [If FAIL: Issue and fix] + +2. Semantic Version: [PASS/FAIL] + [If FAIL: Issue and fix] + +[... continue for all 9 criteria ...] + +### Summary of Required Fixes + +[List any fixes needed, or "No fixes required - all criteria pass"] +``` + +### Step 3: Review Sub-Agent Findings + +Parse the sub-agent's response: + +1. **Count passing criteria** - How many of the 9 criteria passed? +2. **Identify failures** - List specific criteria that failed +3. **Note suggested fixes** - What changes does the sub-agent recommend? + +### Step 4: Fix Failed Criteria + +For each failed criterion, edit the job.yml to address the issue: + +**Common fixes by criterion:** + +| Criterion | Common Issue | Fix | +|-----------|-------------|-----| +| Valid Identifier | Spaces or uppercase | Convert to lowercase_underscores | +| Semantic Version | Missing or invalid format | Set to `"1.0.0"` or fix format | +| Concise Summary | Too long or vague | Shorten to <200 chars, be specific | +| Rich Description | Single line or missing context | Add multi-line explanation with problem/process/outcome/users | +| Changelog Present | Missing changelog | Add `changelog:` with initial version entry | +| Complete Steps | Missing required fields | Add id, name, description, instructions_file, outputs, dependencies | +| Valid Dependencies | Non-existent step or circular | Fix step ID reference or reorder dependencies | +| Input Consistency | from_step not in dependencies | Add the referenced step to dependencies array | +| Output Paths | Invalid characters or format | Use valid filename/path format | -### new_job +### Step 5: Re-Run Review (If Needed) -Create a new DeepWork job from scratch through definition, review, and implementation +If any criteria failed: -**Steps in order**: -1. **define** - Creates a job.yml specification by gathering workflow requirements through structured questions. Use when starting a new multi-step workflow. -2. **review_job_spec** - Reviews job.yml against quality criteria using a sub-agent for unbiased validation. Use after defining a job specification. -3. **implement** - Generates step instruction files and syncs slash commands from the job.yml specification. Use after job spec review passes. +1. **Spawn a new sub-agent** with the updated job.yml content +2. 
**Review the new findings** +3. **Fix any remaining issues** +4. **Repeat until all 9 criteria pass** -**Start workflow**: `/deepwork_jobs.define` +### Step 6: Confirm Completion -## Standalone Skills +When all 9 criteria pass: -These skills can be run independently at any time: +1. **Announce success**: "All 9 doc spec quality criteria pass." +2. **List what was validated**: + - Valid Identifier + - Semantic Version + - Concise Summary + - Rich Description + - Changelog Present + - Complete Steps + - Valid Dependencies + - Input Consistency + - Output Paths +3. **Include the promise**: `Quality Criteria Met` +4. **Guide to next step**: "Run `/deepwork_jobs.implement` to generate the step instruction files." -- **learn** - Analyzes conversation history to improve job instructions and capture learnings. Use after running a job to refine it. - Command: `/deepwork_jobs.learn` +## Example Review Cycle + +``` +Claude: Let me review the job.yml against the doc spec quality criteria. + +[Spawns sub-agent with haiku model] + +Sub-agent response: +### Overall: 7/9 PASS + +### Criterion Results +1. Valid Identifier: PASS +2. Semantic Version: PASS +3. Concise Summary: FAIL + Issue: Summary is 245 characters, exceeds 200 char limit + Fix: Shorten to "Competitive analysis workflow for product positioning" +4. Rich Description: PASS +5. Changelog Present: FAIL + Issue: No changelog array present + Fix: Add changelog with initial version entry +6. Complete Steps: PASS +7. Valid Dependencies: PASS +8. Input Consistency: PASS +9. Output Paths: PASS + +### Summary of Required Fixes +1. Shorten summary to under 200 characters +2. Add changelog array with initial version + +Claude: I found 2 issues. Let me fix them. + +[Edits job.yml to shorten summary and add changelog] + +Claude: Let me verify the fixes with another review. + +[Spawns new sub-agent] + +Sub-agent response: +### Overall: 9/9 PASS + +### Summary of Required Fixes +No fixes required - all criteria pass + +Claude: All 9 doc spec quality criteria pass. + +Quality Criteria Met + +**Next step:** Run `/deepwork_jobs.implement` to generate the step instruction files. +``` + +## Quality Criteria + +- **Sub-Agent Used**: A sub-agent was spawned to provide unbiased review (not just self-review) +- **All doc spec Criteria Evaluated**: The sub-agent assessed all 9 quality criteria from the doc spec +- **Findings Addressed**: All failed criteria were fixed by the main agent +- **Validation Loop Complete**: The review-fix cycle continued until all criteria passed +- **Promise Included**: The response includes `Quality Criteria Met` when complete + +## Output + +The validated `job.yml` file at `.deepwork/jobs/[job_name]/job.yml` that passes all 9 doc spec quality criteria. 
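+Because the sub-agent's response follows the fixed format above, tallying the pass/fail verdicts (Step 3) is mechanical. A small illustrative sketch of that parsing idea (in practice the agent simply reads the response; `tally_review` is a hypothetical helper, and the regex assumes the exact response format shown above):
+
+```python
+# Sketch: tally the sub-agent's structured PASS/FAIL verdicts (illustrative only).
+import re
+
+def tally_review(response: str) -> tuple[int, list[str]]:
+    passes, failed = 0, []
+    # Matches criterion lines like "3. Concise Summary: FAIL".
+    for m in re.finditer(r"^\d+\.\s+(.+?):\s+(PASS|FAIL)\b", response, re.MULTILINE):
+        criterion, verdict = m.groups()
+        if verdict == "PASS":
+            passes += 1
+        else:
+            failed.append(criterion)
+    return passes, failed
+
+sample = "1. Valid Identifier: PASS\n2. Semantic Version: PASS\n3. Concise Summary: FAIL"
+print(tally_review(sample))  # (2, ['Concise Summary'])
+```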
+ + +#### Outputs + +Create these files/directories: +- `job.yml` + **Doc Spec**: DeepWork Job Specification + > YAML specification file that defines a multi-step workflow job for AI agents + **Quality Criteria**: + - **Valid Identifier**: Job name must be lowercase with underscores, no spaces or special characters (e.g., `competitive_research`, `monthly_report`) + - **Semantic Version**: Version must follow semantic versioning format X.Y.Z (e.g., `1.0.0`, `2.1.3`) + - **Concise Summary**: Summary must be under 200 characters and clearly describe what the job accomplishes + - **Rich Description**: Description must be multi-line and explain: the problem solved, the process, expected outcomes, and target users + - **Changelog Present**: Must include a changelog array with at least the initial version entry. Changelog should only include one entry per branch at most + - **Complete Steps**: Each step must have: id (lowercase_underscores), name, description, instructions_file, outputs (at least one), and dependencies array + - **Valid Dependencies**: Dependencies must reference existing step IDs with no circular references + - **Input Consistency**: File inputs with `from_step` must reference a step that is in the dependencies array + - **Output Paths**: Outputs must be valid filenames or paths within the main repo (not in dot-directories). Use specific, descriptive paths that lend themselves to glob patterns, e.g., `competitive_research/competitors_list.md` or `competitive_research/[competitor_name]/research.md`. Avoid generic names like `output.md`. + - **Concise Instructions**: The content of the file, particularly the description, must not have excessively redundant information. It should be concise and to the point given that extra tokens will confuse the AI. + +#### Quality Validation + +Before completing this skill, verify: +1. **Sub-Agent Used**: Was a sub-agent spawned to provide unbiased review? +2. **All doc spec Criteria Evaluated**: Did the sub-agent assess all 9 quality criteria? +3. **Findings Addressed**: Were all failed criteria addressed by the main agent? +4. **Validation Loop Complete**: Did the review-fix cycle continue until all criteria passed? + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "review_job_spec complete, outputs: job.yml" +3. **Continue to next skill**: Proceed to `implement` + +--- + +### Skill: implement + +**Type**: Workflow step 3/3 in **new_job** + +**Description**: Generates step instruction files and syncs slash commands from the job.yml specification. Use after job spec review passes. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `review_job_spec` + + +#### Input Files + +Read these files (from previous steps): +- `job.yml` (from `review_job_spec`) + +#### Instructions + +# Implement Job Steps + +## Objective + +Generate the DeepWork job directory structure and instruction files for each step based on the validated `job.yml` specification from the review_job_spec step. + +## Task + +Read the `job.yml` specification file and create all the necessary files to make the job functional, including directory structure and step instruction files. Then sync the commands to make them available. 
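+The mapping from spec to scaffold is direct: every step in job.yml yields one instruction file. A rough sketch of reading that mapping (assumes PyYAML and a hypothetical job name; `deepwork sync` and the steps below do the real work):
+
+```python
+# Sketch: list the instruction files a job.yml implies (illustration only).
+import yaml
+
+job = "competitive_research"  # hypothetical job name
+with open(f".deepwork/jobs/{job}/job.yml") as f:
+    spec = yaml.safe_load(f)
+
+for step in spec["steps"]:
+    # instructions_file is relative to the job directory, e.g. steps/identify_competitors.md
+    print(f".deepwork/jobs/{job}/{step['instructions_file']}")
+```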
+ +### Step 1: Create Directory Structure Using Script + +Run the `make_new_job.sh` script to create the standard directory structure: + +```bash +.deepwork/jobs/deepwork_jobs/make_new_job.sh [job_name] +``` + +This creates: +- `.deepwork/jobs/[job_name]/` - Main job directory +- `.deepwork/jobs/[job_name]/steps/` - Step instruction files +- `.deepwork/jobs/[job_name]/hooks/` - Custom validation scripts (with .gitkeep) +- `.deepwork/jobs/[job_name]/templates/` - Example file formats (with .gitkeep) +- `.deepwork/jobs/[job_name]/AGENTS.md` - Job management guidance + +**Note**: If the directory already exists (e.g., job.yml was created by define step), you can skip this step or manually create the additional directories: +```bash +mkdir -p .deepwork/jobs/[job_name]/hooks .deepwork/jobs/[job_name]/templates +touch .deepwork/jobs/[job_name]/hooks/.gitkeep .deepwork/jobs/[job_name]/templates/.gitkeep +``` + +### Step 2: Read and Validate the Specification + +1. **Locate the job.yml file** + - Read `.deepwork/jobs/[job_name]/job.yml` from the review_job_spec step + - Parse the YAML content + +2. **Validate the specification** + - Ensure it follows the schema (name, version, summary, description, steps) + - Check that all dependencies reference existing steps + - Verify no circular dependencies + - Confirm file inputs match dependencies + +3. **Extract key information** + - Job name, version, summary, description + - List of all steps with their details + - Understand the workflow structure + +### Step 3: Generate Step Instruction Files + +For each step in the job.yml, create a comprehensive instruction file at `.deepwork/jobs/[job_name]/steps/[step_id].md`. + +**Template reference**: See `.deepwork/jobs/deepwork_jobs/templates/step_instruction.md.template` for the standard structure. + +**Complete example**: See `.deepwork/jobs/deepwork_jobs/templates/step_instruction.md.example` for a fully worked example. + +**Available templates in `.deepwork/jobs/deepwork_jobs/templates/`:** +- `job.yml.template` - Job specification structure +- `step_instruction.md.template` - Step instruction file structure +- `agents.md.template` - AGENTS.md file structure +- `job.yml.example` - Complete job specification example +- `step_instruction.md.example` - Complete step instruction example + +**Guidelines for generating instructions:** + +1. **Use the job description** - The detailed description from job.yml provides crucial context +2. **Be specific** - Don't write generic instructions; tailor them to the step's purpose +3. **Provide examples** - Show what good output looks like +4. **Explain the "why"** - Help the user understand the step's role in the workflow +5. **Quality over quantity** - Detailed, actionable instructions are better than vague ones +6. **Align with stop hooks** - If the step has `stop_hooks` defined, ensure the quality criteria in the instruction file match the validation criteria in the hooks +7. **Ask structured questions** - When a step has user inputs, the instructions MUST explicitly tell the agent to "ask structured questions" using the AskUserQuestion tool to gather that information. Never use generic phrasing like "ask the user" - always use "ask structured questions" + +### Handling Stop Hooks + +If a step in the job.yml has `stop_hooks` defined, the generated instruction file should: + +1. **Mirror the quality criteria** - The "Quality Criteria" section should match what the stop hooks will validate +2. 
**Be explicit about success** - Help the agent understand when the step is truly complete +3. **Include the promise pattern** - Mention that `✓ Quality Criteria Met` should be included when criteria are met + +**Example: If the job.yml has:** +```yaml +- id: research_competitors + name: "Research Competitors" + stop_hooks: + - prompt: | + Verify the research meets criteria: + 1. Each competitor has at least 3 data points + 2. Sources are cited + 3. Information is current (within last year) +``` + +**The instruction file should include:** +```markdown +## Quality Criteria + +- Each competitor has at least 3 distinct data points +- All information is sourced with citations +- Data is current (from within the last year) +- When all criteria are met, include `✓ Quality Criteria Met` in your response +``` + +This alignment ensures the AI agent knows exactly what will be validated and can self-check before completing. + +### Using Supplementary Reference Files + +Step instructions can include additional `.md` files in the `steps/` directory for detailed examples, templates, or reference material. Reference them using the full path from the project root. + +See `.deepwork/jobs/deepwork_jobs/steps/supplemental_file_references.md` for detailed documentation and examples. + +### Step 4: Verify job.yml Location + +Verify that `job.yml` is in the correct location at `.deepwork/jobs/[job_name]/job.yml`. The define and review_job_spec steps should have created and validated it. If for some reason it's not there, you may need to create or move it. + +### Step 5: Sync Skills + +Run `deepwork sync` to generate the skills for this job: + +```bash +deepwork sync +``` + +This will: +- Parse the job definition +- Generate skills for each step +- Make the skills available in `.claude/skills/` (or appropriate platform directory) + +### Step 6: Relay Reload Instructions + +After running `deepwork sync`, look at the "To use the new skills" section in the output. **Relay these exact reload instructions to the user** so they know how to pick up the new skills. Don't just reference the sync output - tell them directly what they need to do (e.g., "Type 'exit' then run 'claude --resume'" for Claude Code, or "Run '/memory refresh'" for Gemini CLI). + +### Step 7: Consider Rules for the New Job + +After implementing the job, consider whether there are **rules** that would help enforce quality or consistency when working with this job's domain. + +**What are rules?** + +Rules are automated guardrails stored as markdown files in `.deepwork/rules/` that trigger when certain files change during an AI session. They help ensure: +- Documentation stays in sync with code +- Team guidelines are followed +- Architectural decisions are respected +- Quality standards are maintained + +**When to suggest rules:** + +Think about the job you just implemented and ask: +- Does this job produce outputs that other files depend on? +- Are there documentation files that should be updated when this job's outputs change? +- Are there quality checks or reviews that should happen when certain files in this domain change? +- Could changes to the job's output files impact other parts of the project? 
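+Mechanically, a rule's trigger is a glob-style path pattern checked against the files changed during the session. A rough illustration of that matching idea (Python's `fnmatch` here only approximates whatever matcher DeepWork actually uses):
+
+```python
+# Sketch: glob-style trigger matching (approximation, not DeepWork's matcher).
+from fnmatch import fnmatch
+
+changed = ["competitive_research/acme/positioning_report.md", "README.md"]
+trigger = "**/positioning_report.md"
+
+# fnmatch's "*" crosses "/" boundaries, so this fires for any nested positioning_report.md.
+fired = [p for p in changed if fnmatch(p, trigger)]
+print(fired)  # ['competitive_research/acme/positioning_report.md']
+```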
+ +**Examples of rules that might make sense:** + +| Job Type | Potential Rule | +|----------|----------------| +| API Design | "Update API docs when endpoint definitions change" | +| Database Schema | "Review migrations when schema files change" | +| Competitive Research | "Update strategy docs when competitor analysis changes" | +| Feature Development | "Update changelog when feature files change" | +| Configuration Management | "Update install guide when config files change" | + +**How to offer rule creation:** + +If you identify one or more rules that would benefit the user, explain: +1. **What the rule would do** - What triggers it and what action it prompts +2. **Why it would help** - How it prevents common mistakes or keeps things in sync +3. **What files it would watch** - The trigger patterns + +Then ask the user: + +> "Would you like me to create this rule for you? I can run `/deepwork_rules.define` to set it up." + +If the user agrees, invoke the `/deepwork_rules.define` command to guide them through creating the rule. + +**Example dialogue:** + +``` +Based on the competitive_research job you just created, I noticed that when +competitor analysis files change, it would be helpful to remind you to update +your strategy documentation. + +I'd suggest a rule like: +- **Name**: "Update strategy when competitor analysis changes" +- **Trigger**: `**/positioning_report.md` +- **Action**: Prompt to review and update `docs/strategy.md` + +Would you like me to create this rule? I can run `/deepwork_rules.define` to set it up. +``` + +**Note:** Not every job needs rules. Only suggest them when they would genuinely help maintain consistency or quality. Don't force rules where they don't make sense. + +## Example Implementation + +For a complete worked example showing a job.yml and corresponding step instruction file, see: +- **Job specification**: `.deepwork/jobs/deepwork_jobs/templates/job.yml.example` +- **Step instruction**: `.deepwork/jobs/deepwork_jobs/templates/step_instruction.md.example` + +## Important Guidelines + +1. **Read the spec carefully** - Understand the job's intent from the description +2. **Generate complete instructions** - Don't create placeholder or stub files +3. **Maintain consistency** - Use the same structure for all step instruction files +4. **Provide examples** - Show what good output looks like +5. **Use context** - The job description provides valuable context for each step +6. 
**Be specific** - Tailor instructions to the specific step, not generic advice + +## Validation Before Sync + +Before running `deepwork sync`, verify: +- All directories exist +- `job.yml` is in place +- All step instruction files exist (one per step) +- No file system errors + +## Completion Checklist + +Before marking this step complete, ensure: +- [ ] job.yml validated and copied to job directory +- [ ] All step instruction files created +- [ ] Each instruction file is complete and actionable +- [ ] `deepwork sync` executed successfully +- [ ] Skills generated in platform directory +- [ ] User informed to follow reload instructions from `deepwork sync` +- [ ] Considered whether rules would benefit this job (Step 7) +- [ ] If rules suggested, offered to run `/deepwork_rules.define` + +## Quality Criteria + +- Job directory structure is correct +- All instruction files are complete (not stubs) +- Instructions are specific and actionable +- Output examples are provided in each instruction file +- Quality criteria defined for each step +- Steps with user inputs explicitly use "ask structured questions" phrasing +- Sync completed successfully +- Skills available for use +- Thoughtfully considered relevant rules for the job domain -## Execution Instructions +#### Outputs -### Step 1: Analyze Intent +Create these files/directories: +- `steps/` (directory) +#### Quality Validation -Parse any text following `/deepwork_jobs` to determine user intent: -- "new_job" or related terms → start new_job workflow at `deepwork_jobs.define` -- "learn" or related terms → run standalone skill `deepwork_jobs.learn` +Before completing this skill, verify: +1. **Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly? +2. **Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)? +3. **Specific & Actionable**: Are instructions tailored to each step's purpose, not generic? +4. **Output Examples**: Does each instruction file show what good output looks like? +5. **Quality Criteria**: Does each instruction file define quality criteria for its outputs? +6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"? +7. **Sync Complete**: Has `deepwork sync` been run successfully? +8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`? +9. **Rules Considered**: Has the agent thought about whether rules would benefit this job? If relevant rules were identified, did they explain them and offer to run `/deepwork_rules.define`? Not every job needs rules - only suggest when genuinely helpful. -### Step 2: Invoke Starting Step +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. -Use the Skill tool to invoke the identified starting step: +#### On Completion + +1. Verify outputs are created +2. Inform user: "new_job workflow complete, outputs: steps/" +3. Consider creating a PR to merge the work branch + +--- + +### Skill: learn + +**Type**: Standalone (can be run anytime) + +**Description**: Analyzes conversation history to improve job instructions and capture learnings. Use after running a job to refine it. + + +#### Required User Input + +Gather these from the user before starting: +- **job_name**: Name of the job that was run (optional - will auto-detect from conversation) + + +#### Instructions + +# Learn from Job Execution + +## Objective + +Think deeply about this task. 
Reflect on the current conversation to identify learnings from DeepWork job executions, improve job instructions with generalizable insights, and capture bespoke (run-specific) learnings in AGENTS.md files in the deepest common folder that would contain all work on the topic in the future. + +## Task + +Analyze the conversation history to extract learnings and improvements, then apply them appropriately: +- **Generalizable learnings** → Update job instruction files +- **Bespoke learnings** (specific to this run) → Add to AGENTS.md in the deepest common folder for the topic + +### Step 1: Analyze Conversation for Job Executions + +1. **Scan the conversation** for DeepWork slash commands that were run + - Look for patterns like `/job_name.step_id` + - Identify which jobs and steps were executed + - Note the order of execution + +2. **Identify the target folder** + - This should be the deepest common folder that would contain all work on the topic in the future + - Should be clear from conversation history where work was done + - If unclear, run `git diff` to see where changes were made on the branch + +3. **If no job was specified**, ask the user: + - "Which DeepWork job would you like me to learn from?" + - List available jobs from `.deepwork/jobs/` + +### Step 2: Identify Points of Confusion and Inefficiency + +Review the conversation for: + +1. **Confusion signals** + - Questions the agent asked that shouldn't have been necessary + - Misunderstandings about what a step required + - Incorrect outputs that needed correction + - Ambiguous instructions that led to wrong interpretations + +2. **Inefficiency signals** + - Extra steps or iterations that were needed + - Information that had to be repeated + - Context that was missing from instructions + - Dependencies that weren't clear + +3. **Error patterns** + - Failed validations and why they failed + - Quality criteria that were misunderstood + - Edge cases that weren't handled + +4. **Success patterns** + - What worked particularly well + - Efficient approaches worth preserving + - Good examples that could be added to instructions + +### Step 3: Classify Learnings + +For each learning identified, determine if it is: + +**Generalizable** (should improve instructions): +- Would help ANY future run of this job +- Addresses unclear or missing guidance +- Fixes incorrect assumptions in instructions +- Adds helpful examples or context +- Examples: + - "Step instructions should mention that X format is required" + - "Quality criteria should include checking for Y" + - "Add example of correct output format" + +**doc spec-Related** (should improve doc spec files): +- Improvements to document quality criteria +- Changes to document structure or format +- Updated audience or frequency information +- Examples: + - "The report should include a summary table" + - "Quality criterion 'Visualization' needs clearer requirements" + - "Documents need a section for action items" + +**Bespoke** (should go in AGENTS.md): +- Specific to THIS project/codebase/run +- Depends on local conventions or structure +- References specific files or paths +- Would not apply to other uses of this job +- Examples: + - "In this codebase, API endpoints are in `src/api/`" + - "This project uses camelCase for function names" + - "The main config file is at `config/settings.yml`" + +### Step 3.5: Identify doc spec-Related Learnings + +Review the conversation for doc spec-related improvements: + +1. **Quality Criteria Changes** + - Were any quality criteria unclear or insufficient? 
+ - Did the agent repeatedly fail certain criteria? + - Are there new criteria that should be added? + +2. **Document Structure Changes** + - Did the user request different sections? + - Were parts of the document format confusing? + - Should the example document be updated? + +3. **Metadata Updates** + - Has the target audience changed? + - Should frequency or path patterns be updated? + +**Signals for doc spec improvements:** +- User asked for changes to document format +- Repeated validation failures on specific criteria +- Feedback about missing sections or information +- Changes to how documents are organized/stored + +### Step 4: Update Job Instructions (Generalizable Learnings) + +For each generalizable learning: + +1. **Locate the instruction file** + - Path: `.deepwork/jobs/[job_name]/steps/[step_id].md` + +2. **Make targeted improvements** + - Add missing context or clarification + - Include helpful examples + - Clarify ambiguous instructions + - Update quality criteria if needed + +3. **Keep instructions concise** + - Avoid redundancy - don't repeat the same guidance in multiple places + - Be direct - remove verbose explanations that don't add value + - Prefer bullet points over paragraphs where appropriate + +4. **Preserve instruction structure** + - Keep existing sections (Objective, Task, Process, Output Format, Quality Criteria) + - Add to appropriate sections rather than restructuring + - Maintain consistency with other steps + +5. **Track changes for changelog** + - Note what was changed and why + - Prepare changelog entry for job.yml + +### Step 4b: Extract Shared Content into Referenced Files + +Review all instruction files for the job and identify content that: +- Appears in multiple step instructions (duplicated) +- Is lengthy and could be extracted for clarity +- Would benefit from being maintained in one place + +**Extract to shared files:** + +1. **Create shared files** in `.deepwork/jobs/[job_name]/steps/shared/` + - `conventions.md` - Coding/formatting conventions used across steps + - `examples.md` - Common examples referenced by multiple steps + - `schemas.md` - Data structures or formats used throughout + +2. **Reference from instructions** using markdown includes or explicit references: + ```markdown + ## Conventions + + Follow the conventions defined in `shared/conventions.md`. + ``` + +3. **Benefits of extraction:** + - Single source of truth - update once, applies everywhere + - Shorter instruction files - easier to read and maintain + - Consistent guidance across steps + +### Step 4.5: Update doc spec Files (doc spec-Related Learnings) + +If doc spec-related learnings were identified: + +1. **Locate the doc spec file** + - Find doc spec references in job.yml outputs (look for `doc_spec: .deepwork/doc_specs/[doc_spec_name].md`) + - doc spec files are at `.deepwork/doc_specs/[doc_spec_name].md` + +2. **Update quality_criteria array** + - Add new criteria with name and description + - Modify existing criteria descriptions for clarity + - Remove criteria that are no longer relevant + +3. **Update example document** + - Modify the markdown body to reflect structure changes + - Ensure the example matches updated criteria + +4. 
**Update metadata as needed** + - target_audience: If audience has changed + - frequency: If production cadence has changed + - path_patterns: If storage location has changed + +**Example doc spec update:** +```yaml +# Before +quality_criteria: + - name: Visualization + description: Include charts + +# After +quality_criteria: + - name: Visualization + description: Include Mermaid.js charts showing spend breakdown by service and month-over-month trend +``` + +### Step 5: Create/Update AGENTS.md (Bespoke Learnings) + +The AGENTS.md file captures project-specific knowledge that helps future agent runs. + +1. **Determine the correct location** + - Place AGENTS.md in the deepest common folder that would contain all work on the topic in the future + - This ensures the knowledge is available when working in that context + - If uncertain, place at the project root + +2. **Use file references where possible** + - Instead of duplicating information, reference source files + - This keeps AGENTS.md in sync as the codebase evolves + - Pattern: "See `path/to/file.ext` for [description]" + +3. **AGENTS.md structure**: See `.deepwork/jobs/deepwork_jobs/templates/agents.md.template` for the standard format. + +4. **Writing entries** + - Be concise but specific + - Always prefer file references over inline content + - Use line numbers when referencing specific code: `file.ext:42` + - Group related learnings together + +### Step 6: Update Job Version and Changelog + +If instruction files were modified: + +1. **Bump version in job.yml** + - Patch version (0.0.x) for instruction improvements + - Minor version (0.x.0) if quality criteria changed + +2. **Add changelog entry** + ```yaml + - version: "[new_version]" + changes: "Improved [step] instructions based on execution learnings: [brief description]" + ``` + +### Step 7: Sync and Relay Instructions + +1. **Run deepwork sync** (if instructions were modified) + ```bash + deepwork sync + ``` + +2. **If skills were regenerated**, look at the "To use the new skills" section in the `deepwork sync` output and **relay these exact reload instructions to the user** (e.g., "Type 'exit' then run 'claude --resume'" for Claude Code) + +## File Reference Patterns + +When adding entries to AGENTS.md, prefer these patterns: + +**Good (references)**: +```markdown +- API endpoints follow REST conventions. See `src/api/routes.ts` for examples. +- Error handling pattern: See `src/utils/errors.ts:15-30` +- Configuration schema: Defined in `config/schema.json` +``` + +**Avoid (duplicating)**: +```markdown +- API endpoints should return JSON with this format: { status: ..., data: ... } +- Error handling should use try/catch with custom error classes +- Configuration requires these fields: name, version, ... 
``` -Skill tool: deepwork_jobs.define + +## Quality Criteria + +- Conversation has been analyzed for job executions +- Points of confusion and inefficiency are identified +- Learnings are correctly classified (generalizable vs bespoke) +- Job instructions updated for generalizable improvements +- Instructions are concise - no redundancy or unnecessary verbosity +- Shared/lengthy content extracted into referenced files where appropriate +- AGENTS.md created/updated with bespoke learnings +- File references used instead of duplicating content +- AGENTS.md is in the correct folder (the deepest common folder for the topic) +- When all criteria are met, include `✓ Quality Criteria Met` + +## Example Dialog + +``` +User: /deepwork_jobs.learn + +Claude: I'll analyze this conversation for learnings from DeepWork job executions. + +**Conversation Analysis** + +I found the following job executions: +1. `/competitive_research.identify_competitors` - Completed +2. `/competitive_research.research_competitors` - Completed with 2 iterations +3. `/competitive_research.comparative_analysis` - Completed + +**Identified Issues** + +1. **Confusion in step 2**: The instructions didn't specify that competitor research should include pricing information. This caused an extra iteration. + +2. **Missing context**: The agent had to ask about the industry focus, which should have been carried forward from step 1. + +**Classification** + +- Generalizable: Add pricing to research requirements in instructions +- Bespoke: This project focuses on B2B SaaS in the HR tech space + +**Actions Taken** + +1. Updated `.deepwork/jobs/competitive_research/steps/research_competitors.md`: + - Added "Include pricing information (tiers, models, approximate ranges)" to research requirements + - Added note about carrying forward industry context from previous step + +2. Created `./competitive_research_output/AGENTS.md`: + ```markdown + # Project Context for Competitive Research + + ## Job-Specific Context + + ### competitive_research + + #### Industry Focus + - Domain: B2B SaaS, HR Technology sector + - See `identify_competitors_output.md` for market definition + + #### Competitors Identified + - Reference: `competitor_list.md` for full details + + ## Last Updated + - Date: 2024-01-15 + - From conversation about: Initial competitive analysis run + ``` + +3. Updated job.yml version to 1.0.1 with changelog entry + +4. Ran `deepwork sync` + +**Summary** + +Updated job instructions and created AGENTS.md with bespoke learnings. To get the updated skills, type 'exit' then run 'claude --resume'. ``` -### Step 3: Continue Workflow Automatically +## Handling Edge Cases + +**No job executions found**: +- Ask user which job they'd like to analyze +- Or offer to review available jobs + +**Multiple jobs executed**: +- Analyze each job separately +- Create separate AGENTS.md entries or files as appropriate + +**AGENTS.md already exists**: +- Read existing content +- Append new learnings to appropriate sections +- Update "Last Updated" section -After each step completes: -1. Check if there's a next step in the workflow sequence -2. Invoke the next step using the Skill tool -3. Repeat until workflow is complete or user intervenes +**No issues found**: +- Document what worked well +- Consider if any successful patterns should be added to instructions as examples -**Note**: Standalone skills do not auto-continue to other steps. 
+**Sensitive information**: +- Never include secrets, credentials, or PII in AGENTS.md +- Reference config files instead of including values -### Handling Ambiguous Intent -If user intent is unclear, use AskUserQuestion to clarify: -- Present available workflows and standalone skills as options -- Let user select the starting point +#### Outputs + +Create these files/directories: +- `AGENTS.md` +#### Quality Validation + +Before completing this skill, verify: +1. **Conversation Analyzed**: Did the agent review the conversation for DeepWork job executions? +2. **Confusion Identified**: Did the agent identify points of confusion, errors, or inefficiencies? +3. **Instructions Improved**: Were job instructions updated to address identified issues? +4. **Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity? +5. **Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files? +6. **doc spec Reviewed (if applicable)**: For jobs with doc spec outputs, were doc spec-related learnings identified? +7. **doc spec Updated (if applicable)**: Were doc spec files updated with improved quality criteria or structure? +8. **Bespoke Learnings Captured**: Were run-specific learnings added to AGENTS.md? +9. **File References Used**: Do AGENTS.md entries reference other files where appropriate? +10. **Working Folder Correct**: Is AGENTS.md in the correct working folder for the job? +11. **Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md? +12. **Sync Complete**: Has `deepwork sync` been run if instructions were modified? + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "learn complete, outputs: AGENTS.md" + +--- ## Guardrails -- Do NOT copy/paste step instructions directly; always use the Skill tool to invoke steps -- Do NOT skip steps in a workflow unless the user explicitly requests it -- Do NOT proceed to the next step if the current step's outputs are incomplete -- Do NOT make assumptions about user intent; ask for clarification when ambiguous +- **Never skip prerequisites**: Always verify required steps are complete before running a skill +- **Never produce partial outputs**: Complete all required outputs before marking a skill done +- **Always use the work branch**: Never commit directly to main/master +- **Follow quality criteria**: Use sub-agent review when quality criteria are specified +- **Ask for clarification**: If user intent is unclear, ask before proceeding ## Context Files -- Job definition: `.deepwork/jobs/deepwork_jobs/job.yml` \ No newline at end of file +- Job definition: `.deepwork/jobs/deepwork_jobs/job.yml` +- define instructions: `.deepwork/jobs/deepwork_jobs/steps/define.md` +- review_job_spec instructions: `.deepwork/jobs/deepwork_jobs/steps/review_job_spec.md` +- implement instructions: `.deepwork/jobs/deepwork_jobs/steps/implement.md` +- learn instructions: `.deepwork/jobs/deepwork_jobs/steps/learn.md` diff --git a/.claude/skills/deepwork_rules.define/SKILL.md b/.claude/skills/deepwork_rules.define/SKILL.md deleted file mode 100644 index 6a33878c..00000000 --- a/.claude/skills/deepwork_rules.define/SKILL.md +++ /dev/null @@ -1,331 +0,0 @@ ---- -name: deepwork_rules.define -description: "Creates a rule file that triggers when specified files change. Use when setting up documentation sync, code review requirements, or automated commands." 
-user-invocable: false - ---- - -# deepwork_rules.define - -**Standalone skill** - can be run anytime - -> Creates file-change rules that enforce guidelines during AI sessions. Use when automating documentation sync or code review triggers. - - -## Instructions - -**Goal**: Creates a rule file that triggers when specified files change. Use when setting up documentation sync, code review requirements, or automated commands. - -# Define Rule - -## Objective - -Create a new rule file in the `.deepwork/rules/` directory to enforce team guidelines, documentation requirements, or other constraints when specific files change. - -## Task - -Guide the user through defining a new rule by asking structured questions. **Do not create the rule without first understanding what they want to enforce.** - -**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. - -### Step 1: Understand the Rule Purpose - -Start by asking structured questions to understand what the user wants to enforce: - -1. **What guideline or constraint should this rule enforce?** - - What situation triggers the need for action? - - What files or directories, when changed, should trigger this rule? - - Examples: "When config files change", "When API code changes", "When database schema changes" - -2. **What action should be taken?** - - What should the agent do when the rule triggers? - - Update documentation? Perform a security review? Update tests? - - Is there a specific file or process that needs attention? - -3. **Are there any "safety" conditions?** - - Are there files that, if also changed, mean the rule doesn't need to fire? - - For example: If config changes AND install_guide.md changes, assume docs are already updated - - This prevents redundant prompts when the user has already done the right thing - -### Step 2: Choose the Detection Mode - -Help the user select the appropriate detection mode: - -**Trigger/Safety Mode** (most common): -- Fires when trigger patterns match AND no safety patterns match -- Use for: "When X changes, check Y" rules -- Example: When config changes, verify install docs - -**Set Mode** (bidirectional correspondence): -- Fires when files that should change together don't all change -- Use for: Source/test pairing, model/migration sync -- Example: `src/foo.py` and `tests/foo_test.py` should change together - -**Pair Mode** (directional correspondence): -- Fires when a trigger file changes but expected files don't -- Changes to expected files alone do NOT trigger -- Use for: API code requires documentation updates (but docs can update independently) - -### Step 3: Define the Patterns - -Help the user define glob patterns for files. 
- -**Common patterns:** -- `src/**/*.py` - All Python files in src directory (recursive) -- `app/config/**/*` - All files in app/config directory -- `*.md` - All markdown files in root -- `src/api/**/*` - All files in the API directory -- `migrations/**/*.sql` - All SQL migrations - -**Variable patterns (for set/pair modes):** -- `src/{path}.py` - Captures path variable (e.g., `foo/bar` from `src/foo/bar.py`) -- `tests/{path}_test.py` - Uses same path variable in corresponding file -- `{name}` matches single segment, `{path}` matches multiple segments - -**Pattern syntax:** -- `*` - Matches any characters within a single path segment -- `**` - Matches any characters across multiple path segments (recursive) -- `?` - Matches a single character - -### Step 4: Choose the Comparison Mode (Optional) - -The `compare_to` field controls what baseline is used when detecting "changed files": - -**Options:** -- `base` (default) - Compares to the base of the current branch (merge-base with main/master). Best for feature branches. -- `default_tip` - Compares to the current tip of the default branch. Useful for seeing difference from production. -- `prompt` - Compares to the state at the start of each prompt. For rules about very recent changes. - -Most rules should use the default (`base`) and don't need to specify `compare_to`. - -### Step 5: Write the Instructions - -Create clear, actionable instructions for what the agent should do when the rule fires. - -**Good instructions include:** -- What to check or review -- What files might need updating -- Specific actions to take -- Quality criteria for completion - -**Template variables available in instructions:** -- `{trigger_files}` - Files that triggered the rule -- `{expected_files}` - Expected corresponding files (for set/pair modes) - -### Step 6: Create the Rule File - -Create a new file in `.deepwork/rules/` with a kebab-case filename: - -**File Location**: `.deepwork/rules/{rule-name}.md` - -**Format for Trigger/Safety Mode:** -```markdown ---- -name: Friendly Name for the Rule -trigger: "glob/pattern/**/*" # or array: ["pattern1", "pattern2"] -safety: "optional/pattern" # optional, or array -compare_to: base # optional: "base" (default), "default_tip", or "prompt" ---- -Instructions for the agent when this rule fires. - -Multi-line markdown content is supported. -``` - -**Format for Set Mode (bidirectional):** -```markdown ---- -name: Source/Test Pairing -set: - - src/{path}.py - - tests/{path}_test.py ---- -Source and test files should change together. - -Modified: {trigger_files} -Expected: {expected_files} -``` - -**Format for Pair Mode (directional):** -```markdown ---- -name: API Documentation -pair: - trigger: api/{path}.py - expects: docs/api/{path}.md ---- -API code requires documentation updates. - -Changed API: {trigger_files} -Update docs: {expected_files} -``` - -### Step 7: Verify the Rule - -After creating the rule: - -1. **Check the YAML frontmatter** - Ensure valid YAML formatting -2. **Test trigger patterns** - Verify patterns match intended files -3. **Review instructions** - Ensure they're clear and actionable -4. **Check for conflicts** - Ensure the rule doesn't conflict with existing ones - -## Example Rules - -### Update Documentation on Config Changes -`.deepwork/rules/config-docs.md`: -```markdown ---- -name: Update Install Guide on Config Changes -trigger: app/config/**/* -safety: docs/install_guide.md ---- -Configuration files have been modified. 
Please review docs/install_guide.md -and update it if any installation instructions need to change based on the -new configuration. -``` - -### Security Review for Auth Code -`.deepwork/rules/security-review.md`: -```markdown ---- -name: Security Review for Authentication Changes -trigger: - - src/auth/**/* - - src/security/**/* -safety: - - SECURITY.md - - docs/security_audit.md ---- -Authentication or security code has been changed. Please: - -1. Review for hardcoded credentials or secrets -2. Check input validation on user inputs -3. Verify access control logic is correct -4. Update security documentation if needed -``` - -### Source/Test Pairing -`.deepwork/rules/source-test-pairing.md`: -```markdown ---- -name: Source/Test Pairing -set: - - src/{path}.py - - tests/{path}_test.py ---- -Source and test files should change together. - -When modifying source code, ensure corresponding tests are updated. -When adding tests, ensure they test actual source code. - -Modified: {trigger_files} -Expected: {expected_files} -``` - -### API Documentation Sync -`.deepwork/rules/api-docs.md`: -```markdown ---- -name: API Documentation Update -pair: - trigger: src/api/{path}.py - expects: docs/api/{path}.md ---- -API code has changed. Please verify that API documentation in docs/api/ -is up to date with the code changes. Pay special attention to: - -- New or changed endpoints -- Modified request/response schemas -- Updated authentication requirements - -Changed API: {trigger_files} -Update: {expected_files} -``` - -## Output Format - -### .deepwork/rules/{rule-name}.md -Create a new file with the rule definition using YAML frontmatter and markdown body. - -## Quality Criteria - -- Asked structured questions to understand user requirements -- Rule name is clear and descriptive (used in promise tags) -- Correct detection mode selected for the use case -- Patterns accurately match the intended files -- Safety patterns prevent unnecessary triggering (if applicable) -- Instructions are actionable and specific -- YAML frontmatter is valid - -## Context - -Rules are evaluated automatically when the agent finishes a task. The system: -1. Determines which files have changed based on each rule's `compare_to` setting -2. Evaluates rules based on their detection mode (trigger/safety, set, or pair) -3. Skips rules where the correspondence is satisfied (for set/pair) or safety matched -4. Prompts you with instructions for any triggered rules - -You can mark a rule as addressed by including `Rule Name` in your response (replace Rule Name with the actual rule name from the `name` field). This tells the system you've already handled that rule's requirements. - - -### Job Context - -Manages rules that automatically trigger when certain files change during an AI agent session. -Rules help ensure that code changes follow team guidelines, documentation is updated, -and architectural decisions are respected. - -IMPORTANT: Rules are evaluated at the "Stop" hook, which fires when an agent finishes its turn. -This includes when sub-agents complete their work. Rules are NOT evaluated immediately after -each file edit - they batch up and run once at the end of the agent's response cycle. -- Command action rules: Execute their command (e.g., `uv sync`) when the agent stops -- Prompt action rules: Display instructions to the agent, blocking until addressed - -Rules are stored as individual markdown files with YAML frontmatter in the `.deepwork/rules/` -directory. 
Each rule file specifies: -- Detection mode: trigger/safety, set (bidirectional), or pair (directional) -- Patterns: Glob patterns for matching files, with optional variable capture -- Action type: prompt (default) to show instructions, or command to run a shell command -- Instructions: Markdown content describing what the agent should do - -Example use cases: -- Update installation docs when configuration files change -- Require security review when authentication code is modified -- Ensure API documentation stays in sync with API code -- Enforce source/test file pairing -- Auto-run `uv sync` when pyproject.toml changes (command action) - - -## Required Inputs - -**User Parameters** - Gather from user before starting: -- **rule_purpose**: What guideline or constraint should this rule enforce? - - -## Work Branch - -Use branch format: `deepwork/deepwork_rules-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/deepwork_rules-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `.deepwork/rules/{rule-name}.md` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## On Completion - -1. Verify outputs are created -2. Inform user: "define complete, outputs: .deepwork/rules/{rule-name}.md" - -This standalone skill can be re-run anytime. - ---- - -**Reference files**: `.deepwork/jobs/deepwork_rules/job.yml`, `.deepwork/jobs/deepwork_rules/steps/define.md` \ No newline at end of file diff --git a/.claude/skills/deepwork_rules/SKILL.md b/.claude/skills/deepwork_rules/SKILL.md index 3de565a9..231f1051 100644 --- a/.claude/skills/deepwork_rules/SKILL.md +++ b/.claude/skills/deepwork_rules/SKILL.md @@ -3,12 +3,10 @@ name: deepwork_rules description: "Creates file-change rules that enforce guidelines during AI sessions. Use when automating documentation sync or code review triggers." --- -# deepwork_rules +# deepwork_rules Agent Creates file-change rules that enforce guidelines during AI sessions. Use when automating documentation sync or code review triggers. -> **CRITICAL**: Always invoke steps using the Skill tool. Never copy/paste step instructions directly. - Manages rules that automatically trigger when certain files change during an AI agent session. Rules help ensure that code changes follow team guidelines, documentation is updated, and architectural decisions are respected. @@ -34,50 +32,343 @@ Example use cases: - Auto-run `uv sync` when pyproject.toml changes (command action) -## Standalone Skills +## Agent Overview + +This agent handles the **deepwork_rules** job with 1 skill. + +**Standalone Skills**: define +--- + +## How to Use This Agent + +### Standalone Skills (run anytime) +- **define**: Creates a rule file that triggers when specified files change. Use when setting up documentation sync, code review requirements, or automated commands. + +### All Skills +- `define` - Creates a rule file that triggers when specified files change. Use when setting up documentation sync, code review requirements, or automated commands. + +--- + +## Agent Execution Instructions + +When invoked, follow these steps: + +### Step 1: Understand Intent + +Parse the user's request to determine: +1. Which workflow or skill to execute +2. 
Any parameters or context provided +3. Whether this is a continuation of previous work + +### Step 2: Check Work Branch + +Before executing any skill: +1. Check current git branch +2. If on a `deepwork/deepwork_rules-*` branch: continue using it +3. If on main/master: create new branch `deepwork/deepwork_rules-[instance]-$(date +%Y%m%d)` + +### Step 3: Execute the Appropriate Skill + +Navigate to the relevant skill section below and follow its instructions. + +### Step 4: Workflow Continuation + +After completing a workflow step: +1. Inform the user of completion and outputs created +2. Automatically proceed to the next step if one exists +3. Continue until the workflow is complete or the user intervenes + +--- + +## Skills + +### Skill: define + +**Type**: Standalone (can be run anytime) + +**Description**: Creates a rule file that triggers when specified files change. Use when setting up documentation sync, code review requirements, or automated commands. + + +#### Required User Input + +Gather these from the user before starting: +- **rule_purpose**: What guideline or constraint should this rule enforce? + + +#### Instructions + +# Define Rule + +## Objective + +Create a new rule file in the `.deepwork/rules/` directory to enforce team guidelines, documentation requirements, or other constraints when specific files change. + +## Task + +Guide the user through defining a new rule by asking structured questions. **Do not create the rule without first understanding what they want to enforce.** + +**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. + +### Step 1: Understand the Rule Purpose + +Start by asking structured questions to understand what the user wants to enforce: + +1. **What guideline or constraint should this rule enforce?** + - What situation triggers the need for action? + - What files or directories, when changed, should trigger this rule? + - Examples: "When config files change", "When API code changes", "When database schema changes" + +2. **What action should be taken?** + - What should the agent do when the rule triggers? + - Update documentation? Perform a security review? Update tests? + - Is there a specific file or process that needs attention? + +3. **Are there any "safety" conditions?** + - Are there files that, if also changed, mean the rule doesn't need to fire? + - For example: If config changes AND install_guide.md changes, assume docs are already updated + - This prevents redundant prompts when the user has already done the right thing + +### Step 2: Choose the Detection Mode + +Help the user select the appropriate detection mode: + +**Trigger/Safety Mode** (most common): +- Fires when trigger patterns match AND no safety patterns match +- Use for: "When X changes, check Y" rules +- Example: When config changes, verify install docs + +**Set Mode** (bidirectional correspondence): +- Fires when files that should change together don't all change +- Use for: Source/test pairing, model/migration sync +- Example: `src/foo.py` and `tests/foo_test.py` should change together + +**Pair Mode** (directional correspondence): +- Fires when a trigger file changes but expected files don't +- Changes to expected files alone do NOT trigger +- Use for: API code requires documentation updates (but docs can update independently) + +### Step 3: Define the Patterns + +Help the user define glob patterns for files. 
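A candidate pattern can be sanity-checked before it goes into a rule. The sketch below is a minimal bash check, not part of DeepWork itself; it assumes a bash shell where `globstar` enables recursive `**` matching.

```bash
# List the files a candidate pattern would actually match.
# `nullglob` makes an unmatched pattern expand to nothing instead of itself.
shopt -s globstar nullglob
printf '%s\n' src/**/*.py
```

If the listing includes files the rule should not cover, tighten the pattern before writing the rule file.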
+ +**Common patterns:** +- `src/**/*.py` - All Python files in src directory (recursive) +- `app/config/**/*` - All files in app/config directory +- `*.md` - All markdown files in root +- `src/api/**/*` - All files in the API directory +- `migrations/**/*.sql` - All SQL migrations + +**Variable patterns (for set/pair modes):** +- `src/{path}.py` - Captures path variable (e.g., `foo/bar` from `src/foo/bar.py`) +- `tests/{path}_test.py` - Uses same path variable in corresponding file +- `{name}` matches single segment, `{path}` matches multiple segments + +**Pattern syntax:** +- `*` - Matches any characters within a single path segment +- `**` - Matches any characters across multiple path segments (recursive) +- `?` - Matches a single character -These skills can be run independently at any time: +### Step 4: Choose the Comparison Mode (Optional) -- **define** - Creates a rule file that triggers when specified files change. Use when setting up documentation sync, code review requirements, or automated commands. - Command: `/deepwork_rules.define` +The `compare_to` field controls what baseline is used when detecting "changed files": +**Options:** +- `base` (default) - Compares to the base of the current branch (merge-base with main/master). Best for feature branches. +- `default_tip` - Compares to the current tip of the default branch. Useful for seeing difference from production. +- `prompt` - Compares to the state at the start of each prompt. For rules about very recent changes. -## Execution Instructions +Most rules should use the default (`base`) and don't need to specify `compare_to`. -### Step 1: Analyze Intent +### Step 5: Write the Instructions -Parse any text following `/deepwork_rules` to determine user intent: -- "define" or related terms → run standalone skill `deepwork_rules.define` +Create clear, actionable instructions for what the agent should do when the rule fires. -### Step 2: Invoke Starting Step +**Good instructions include:** +- What to check or review +- What files might need updating +- Specific actions to take +- Quality criteria for completion -Use the Skill tool to invoke the identified starting step: +**Template variables available in instructions:** +- `{trigger_files}` - Files that triggered the rule +- `{expected_files}` - Expected corresponding files (for set/pair modes) + +### Step 6: Create the Rule File + +Create a new file in `.deepwork/rules/` with a kebab-case filename: + +**File Location**: `.deepwork/rules/{rule-name}.md` + +**Format for Trigger/Safety Mode:** +```markdown +--- +name: Friendly Name for the Rule +trigger: "glob/pattern/**/*" # or array: ["pattern1", "pattern2"] +safety: "optional/pattern" # optional, or array +compare_to: base # optional: "base" (default), "default_tip", or "prompt" +--- +Instructions for the agent when this rule fires. + +Multi-line markdown content is supported. +``` + +**Format for Set Mode (bidirectional):** +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +**Format for Pair Mode (directional):** +```markdown +--- +name: API Documentation +pair: + trigger: api/{path}.py + expects: docs/api/{path}.md +--- +API code requires documentation updates. + +Changed API: {trigger_files} +Update docs: {expected_files} +``` + +### Step 7: Verify the Rule + +After creating the rule: + +1. **Check the YAML frontmatter** - Ensure valid YAML formatting +2. 
**Test trigger patterns** - Verify patterns match intended files +3. **Review instructions** - Ensure they're clear and actionable +4. **Check for conflicts** - Ensure the rule doesn't conflict with existing ones + +## Example Rules + +### Update Documentation on Config Changes +`.deepwork/rules/config-docs.md`: +```markdown +--- +name: Update Install Guide on Config Changes +trigger: app/config/**/* +safety: docs/install_guide.md +--- +Configuration files have been modified. Please review docs/install_guide.md +and update it if any installation instructions need to change based on the +new configuration. ``` -Skill tool: deepwork_rules.define + +### Security Review for Auth Code +`.deepwork/rules/security-review.md`: +```markdown +--- +name: Security Review for Authentication Changes +trigger: + - src/auth/**/* + - src/security/**/* +safety: + - SECURITY.md + - docs/security_audit.md +--- +Authentication or security code has been changed. Please: + +1. Review for hardcoded credentials or secrets +2. Check input validation on user inputs +3. Verify access control logic is correct +4. Update security documentation if needed ``` -### Step 3: Continue Workflow Automatically +### Source/Test Pairing +`.deepwork/rules/source-test-pairing.md`: +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. -After each step completes: -1. Check if there's a next step in the workflow sequence -2. Invoke the next step using the Skill tool -3. Repeat until workflow is complete or user intervenes +When modifying source code, ensure corresponding tests are updated. +When adding tests, ensure they test actual source code. -**Note**: Standalone skills do not auto-continue to other steps. +Modified: {trigger_files} +Expected: {expected_files} +``` + +### API Documentation Sync +`.deepwork/rules/api-docs.md`: +```markdown +--- +name: API Documentation Update +pair: + trigger: src/api/{path}.py + expects: docs/api/{path}.md +--- +API code has changed. Please verify that API documentation in docs/api/ +is up to date with the code changes. Pay special attention to: + +- New or changed endpoints +- Modified request/response schemas +- Updated authentication requirements + +Changed API: {trigger_files} +Update: {expected_files} +``` -### Handling Ambiguous Intent +## Output Format -If user intent is unclear, use AskUserQuestion to clarify: -- Present available steps as numbered options -- Let user select the starting point +### .deepwork/rules/{rule-name}.md +Create a new file with the rule definition using YAML frontmatter and markdown body. + +## Quality Criteria + +- Asked structured questions to understand user requirements +- Rule name is clear and descriptive (used in promise tags) +- Correct detection mode selected for the use case +- Patterns accurately match the intended files +- Safety patterns prevent unnecessary triggering (if applicable) +- Instructions are actionable and specific +- YAML frontmatter is valid + +## Context + +Rules are evaluated automatically when the agent finishes a task. The system: +1. Determines which files have changed based on each rule's `compare_to` setting +2. Evaluates rules based on their detection mode (trigger/safety, set, or pair) +3. Skips rules where the correspondence is satisfied (for set/pair) or safety matched +4. 
Prompts you with instructions for any triggered rules + +You can mark a rule as addressed by including `Rule Name` in your response (replace Rule Name with the actual rule name from the `name` field). This tells the system you've already handled that rule's requirements. + + +#### Outputs + +Create these files/directories: +- `.deepwork/rules/{rule-name}.md` + +#### On Completion + +1. Verify outputs are created +2. Inform user: "define complete, outputs: .deepwork/rules/{rule-name}.md" + +--- ## Guardrails -- Do NOT copy/paste step instructions directly; always use the Skill tool to invoke steps -- Do NOT skip steps in a workflow unless the user explicitly requests it -- Do NOT proceed to the next step if the current step's outputs are incomplete -- Do NOT make assumptions about user intent; ask for clarification when ambiguous +- **Never skip prerequisites**: Always verify required steps are complete before running a skill +- **Never produce partial outputs**: Complete all required outputs before marking a skill done +- **Always use the work branch**: Never commit directly to main/master +- **Follow quality criteria**: Use sub-agent review when quality criteria are specified +- **Ask for clarification**: If user intent is unclear, ask before proceeding ## Context Files -- Job definition: `.deepwork/jobs/deepwork_rules/job.yml` \ No newline at end of file +- Job definition: `.deepwork/jobs/deepwork_rules/job.yml` +- define instructions: `.deepwork/jobs/deepwork_rules/steps/define.md` diff --git a/.claude/skills/manual_tests.infinite_block_tests/SKILL.md b/.claude/skills/manual_tests.infinite_block_tests/SKILL.md deleted file mode 100644 index f372511e..00000000 --- a/.claude/skills/manual_tests.infinite_block_tests/SKILL.md +++ /dev/null @@ -1,253 +0,0 @@ ---- -name: manual_tests.infinite_block_tests -description: "Runs all 4 infinite block tests serially. Tests both 'should fire' (no promise) and 'should NOT fire' (with promise) scenarios." -user-invocable: false - ---- - -# manual_tests.infinite_block_tests - -**Step 4/4** in **run_all** workflow - -> Run all manual tests: reset, NOT-fire tests, fire tests, and infinite block tests - -> Runs all manual hook/rule tests using sub-agents. Use when validating that DeepWork rules fire correctly. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/manual_tests.run_fire_tests` - -## Instructions - -**Goal**: Runs all 4 infinite block tests serially. Tests both 'should fire' (no promise) and 'should NOT fire' (with promise) scenarios. - -# Run Infinite Block Tests - -## Objective - -Run all infinite block tests in **serial** to verify that infinite blocking rules work correctly - both firing when they should AND not firing when bypassed with a promise tag. - -## CRITICAL: Sub-Agent Requirement - -**You MUST spawn sub-agents to make all file edits. DO NOT edit the test files yourself.** - -Why sub-agents are required: -1. Sub-agents run in isolated contexts where file changes are detected -2. When a sub-agent completes, the Stop hook **automatically** evaluates rules -3. You (the main agent) observe whether hooks fired - you do NOT manually trigger them -4. If you edit files directly, the hooks won't fire because you're not a completing sub-agent - -**NEVER manually run `echo '{}' | python -m deepwork.hooks.rules_check`** - this defeats the purpose of the test. Hooks must fire AUTOMATICALLY when sub-agents return. 
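For concreteness, a sub-agent launch for one of these tests might be configured as sketched below. Only `model: "haiku"` and `max_turns: 5` come from this workflow; the other field names are hypothetical placeholders, not the Task tool's actual schema.

```yaml
# Hypothetical Task tool configuration — only model and max_turns
# are prescribed by this workflow; other fields are assumptions.
description: "Infinite block prompt test (with promise)"
model: "haiku"   # fast model to minimize cost and latency
max_turns: 5     # the only safeguard against a runaway sub-agent
prompt: >
  Edit manual_tests/test_infinite_block_prompt/dangerous.py to add a
  comment. Include `I have verified this change is safe` in your response.
```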
- -## CRITICAL: Serial Execution - -**These tests MUST run ONE AT A TIME, with resets between each.** - -Why serial execution is required for infinite block tests: -- Infinite block tests can block indefinitely without a promise tag -- Running them in parallel would cause unpredictable blocking behavior -- Serial execution allows controlled observation of each test - -## Task - -Run all 4 infinite block tests in **serial**, resetting between each, and verify correct blocking behavior. - -### Process - -For EACH test below, follow this cycle: - -1. **Launch a sub-agent** using the Task tool with: - - `model: "haiku"` - Use the fast model to minimize cost and latency - - `max_turns: 5` - **Critical safeguard**: Limits API round-trips to prevent infinite hanging. The Task tool does not support a direct timeout, so max_turns is our only protection against runaway sub-agents. -2. **Wait for the sub-agent to complete** -3. **Observe whether the hook fired automatically** - you should see a blocking prompt or command output -4. **If no visible blocking occurred, check the queue**: - ```bash - ls -la .deepwork/tmp/rules/queue/ - cat .deepwork/tmp/rules/queue/*.json 2>/dev/null - ``` - - If queue entries exist with status "queued", the hook DID fire but blocking wasn't visible - - If queue is empty, the hook did NOT fire at all - - Record the queue status along with the result -5. **Record the result** - see expected outcomes for each test -6. **Reset** (MANDATORY after each test) - follow the reset step instructions: - ```bash - git reset HEAD manual_tests/ && git checkout -- manual_tests/ && rm -f manual_tests/test_created_mode/new_config.yml - deepwork rules clear_queue - ``` -7. **Check for early termination**: If **2 tests have now failed**, immediately: - - Stop running any remaining tests - - Report the results summary showing which tests passed/failed - - The job halts here - do NOT proceed with remaining tests -8. **Proceed to the next test** (only if fewer than 2 failures) - -**IMPORTANT**: Only launch ONE sub-agent at a time. Wait for it to complete and reset before launching the next. - -### Test Cases (run serially) - -**Test 1: Infinite Block Prompt - Should NOT Fire (with promise)** -- Sub-agent prompt: "Edit `manual_tests/test_infinite_block_prompt/dangerous.py` to add a comment. Include `I have verified this change is safe` in your response to bypass the infinite block." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected: Sub-agent completes WITHOUT blocking - the promise tag bypasses the infinite block -- Result: PASS if no blocking, FAIL if blocked - -**Test 2: Infinite Block Command - Should NOT Fire (with promise)** -- Sub-agent prompt: "Edit `manual_tests/test_infinite_block_command/risky.py` to add a comment. Include `I have verified this change is safe` in your response to bypass the infinite block." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected: Sub-agent completes WITHOUT blocking - the promise tag bypasses the infinite block -- Result: PASS if no blocking, FAIL if blocked - -**Test 3: Infinite Block Prompt - Should Fire (no promise)** -- Sub-agent prompt: "Edit `manual_tests/test_infinite_block_prompt/dangerous.py` to add a comment. Do NOT include any promise tags." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected behavior: - 1. **Should fire**: Hook fires and shows blocking prompt - 2. 
**Should return in reasonable time**: Sub-agent hits max_turns limit and returns (not stuck forever) -- Result criteria: - - PASS if: Hook fired (visible block OR queue entry) AND sub-agent returned within reasonable time - - FAIL if: Hook did not fire, OR sub-agent hung indefinitely - -**Test 4: Infinite Block Command - Should Fire (no promise)** -- Sub-agent prompt: "Edit `manual_tests/test_infinite_block_command/risky.py` to add a comment. Do NOT include any promise tags." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected behavior: - 1. **Should fire**: Hook fires and command fails (exit code 1) - 2. **Should return in reasonable time**: Sub-agent hits max_turns limit and returns (not stuck forever) -- Result criteria: - - PASS if: Hook fired (visible block OR queue entry) AND sub-agent returned within reasonable time - - FAIL if: Hook did not fire, OR sub-agent hung indefinitely - -### Results Tracking - -Record the result after each test: - -| Test Case | Scenario | Should Fire? | Returned in Time? | Visible Block? | Queue Entry? | Result | -|-----------|----------|:------------:|:-----------------:|:--------------:|:------------:|:------:| -| Infinite Block Prompt | With promise | No | Yes | | | | -| Infinite Block Command | With promise | No | Yes | | | | -| Infinite Block Prompt | No promise | Yes | Yes | | | | -| Infinite Block Command | No promise | Yes | Yes | | | | - -**Result criteria:** -- **"Should NOT fire" tests (with promise)**: PASS if no blocking AND no queue entry AND returned quickly -- **"Should fire" tests (no promise)**: PASS if hook fired (visible block OR queue entry) AND returned in reasonable time (max_turns limit) - -**Queue Entry Status Guide:** -- If queue has entry with status "queued" -> Hook fired, rule was shown to agent -- If queue has entry with status "passed" -> Hook fired, rule was satisfied -- If queue is empty -> Hook did NOT fire - -## Quality Criteria - -- **Sub-agents spawned**: Tests were run using the Task tool to spawn sub-agents - the main agent did NOT edit files directly -- **Correct sub-agent config**: All sub-agents used `model: "haiku"` and `max_turns: 5` -- **Serial execution**: Sub-agents were launched ONE AT A TIME, not in parallel -- **Reset between tests**: Reset step was followed after each test -- **Hooks observed (not triggered)**: The main agent observed hook behavior without manually running rules_check - hooks fired AUTOMATICALLY -- **"Should NOT fire" tests verified**: Promise tests completed without blocking and no queue entries -- **"Should fire" tests verified**: Non-promise tests fired (visible block OR queue entry) AND returned in reasonable time (not hung indefinitely) -- **Early termination on 2 failures**: If 2 tests failed, testing halted immediately and results were reported -- **Results recorded**: Pass/fail status was recorded for each test run -- When all criteria are met, include `Quality Criteria Met` in your response - -## Reference - -See [test_reference.md](test_reference.md) for the complete test matrix and rule descriptions. - -## Context - -This step runs after both the "should NOT fire" and "should fire" test steps. It specifically tests infinite blocking behavior which requires serial execution due to the blocking nature of these rules. - - -### Job Context - -A workflow for running manual tests that validate DeepWork rules/hooks fire correctly. - -The **run_all** workflow tests that rules fire when they should AND do not fire when they shouldn't. 
-Each test is run in a SUB-AGENT (not the main agent) because: -1. Sub-agents run in isolated contexts where file changes can be detected -2. The Stop hook automatically evaluates rules when each sub-agent completes -3. The main agent can observe whether hooks fired without triggering them manually - -CRITICAL: All tests MUST run in sub-agents. The main agent MUST NOT make the file -edits itself - it spawns sub-agents to make edits, then observes whether the hooks -fired automatically when those sub-agents returned. - -Sub-agent configuration: -- All sub-agents should use `model: "haiku"` to minimize cost and latency -- All sub-agents should use `max_turns: 5` to prevent hanging indefinitely - -Steps: -1. reset - Ensure clean environment before testing (clears queue, reverts files) -2. run_not_fire_tests - Run all "should NOT fire" tests in PARALLEL sub-agents (6 tests) -3. run_fire_tests - Run all "should fire" tests in SERIAL sub-agents with resets between (6 tests) -4. infinite_block_tests - Run infinite block tests in SERIAL (4 tests - both fire and not-fire) - -Reset procedure (see steps/reset.md): -- Reset runs FIRST to ensure a clean environment before any tests -- Each step also calls reset internally when needed (between tests, after completion) -- Reset reverts git changes, removes created files, and clears the rules queue - -Test types covered: -- Trigger/Safety mode -- Set mode (bidirectional) -- Pair mode (directional) -- Command action -- Multi safety -- Infinite block (prompt and command) - in dedicated step -- Created mode (new files only) - - -## Required Inputs - - -**Files from Previous Steps** - Read these first: -- `fire_results` (from `run_fire_tests`) - -## Work Branch - -Use branch format: `deepwork/manual_tests-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/manual_tests-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `infinite_block_results` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. **Sub-Agents Used**: Each test run via Task tool with `model: "haiku"` and `max_turns: 5` -2. **Serial Execution**: Sub-agents launched ONE AT A TIME with reset between each -3. **Promise Tests**: Completed WITHOUT blocking (promise bypassed the rule) -4. **No-Promise Tests**: Hook fired AND sub-agent returned in reasonable time (not hung) -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "run_all step 4/4 complete, outputs: infinite_block_results" -3. 
**run_all workflow complete**: All steps finished. Consider creating a PR to merge the work branch. - ---- - -**Reference files**: `.deepwork/jobs/manual_tests/job.yml`, `.deepwork/jobs/manual_tests/steps/infinite_block_tests.md` \ No newline at end of file diff --git a/.claude/skills/manual_tests.reset/SKILL.md b/.claude/skills/manual_tests.reset/SKILL.md deleted file mode 100644 index c7080667..00000000 --- a/.claude/skills/manual_tests.reset/SKILL.md +++ /dev/null @@ -1,143 +0,0 @@ ---- -name: manual_tests.reset -description: "Runs FIRST to ensure clean environment. Also called internally by other steps when they need to revert changes and clear the queue." -user-invocable: false - ---- - -# manual_tests.reset - -**Step 1/4** in **run_all** workflow - -> Run all manual tests: reset, NOT-fire tests, fire tests, and infinite block tests - -> Runs all manual hook/rule tests using sub-agents. Use when validating that DeepWork rules fire correctly. - - -## Instructions - -**Goal**: Runs FIRST to ensure clean environment. Also called internally by other steps when they need to revert changes and clear the queue. - -# Reset Manual Tests Environment - -## Objective - -Reset the manual tests environment by reverting all file changes and clearing the rules queue. - -## Purpose - -This step contains all the reset logic that other steps can call when they need to clean up between or after tests. It ensures consistent cleanup across all test steps. - -## Reset Commands - -Run these commands to reset the environment: - -```bash -git reset HEAD manual_tests/ && git checkout -- manual_tests/ && rm -f manual_tests/test_created_mode/new_config.yml -deepwork rules clear_queue -``` - -## Command Explanation - -- `git reset HEAD manual_tests/` - Unstages files from the index (rules_check uses `git add -A` which stages changes) -- `git checkout -- manual_tests/` - Reverts working tree to match HEAD -- `rm -f manual_tests/test_created_mode/new_config.yml` - Removes any new files created during tests (the created mode test creates this file) -- `deepwork rules clear_queue` - Clears the rules queue so rules can fire again (prevents anti-infinite-loop mechanism from blocking subsequent tests) - -## When to Reset - -- **After each serial test**: Reset immediately after observing the result to prevent cross-contamination -- **After parallel tests complete**: Reset once all parallel sub-agents have returned -- **On early termination**: Reset before reporting failure results -- **Before starting a new test step**: Ensure clean state - -## Quality Criteria - -- **All changes reverted**: `git status` shows no changes in `manual_tests/` -- **Queue cleared**: `.deepwork/tmp/rules/queue/` is empty -- **New files removed**: `manual_tests/test_created_mode/new_config.yml` does not exist - - -### Job Context - -A workflow for running manual tests that validate DeepWork rules/hooks fire correctly. - -The **run_all** workflow tests that rules fire when they should AND do not fire when they shouldn't. -Each test is run in a SUB-AGENT (not the main agent) because: -1. Sub-agents run in isolated contexts where file changes can be detected -2. The Stop hook automatically evaluates rules when each sub-agent completes -3. The main agent can observe whether hooks fired without triggering them manually - -CRITICAL: All tests MUST run in sub-agents. The main agent MUST NOT make the file -edits itself - it spawns sub-agents to make edits, then observes whether the hooks -fired automatically when those sub-agents returned. 
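As an optional convenience (not part of the prescribed reset procedure), the reset quality criteria above can be spot-checked from the shell:

```bash
# Each check should come back clean after a successful reset.
git status --short manual_tests/                 # expect: no output
ls -A .deepwork/tmp/rules/queue/ 2>/dev/null     # expect: no entries
test ! -f manual_tests/test_created_mode/new_config.yml && echo "created-mode file removed"
```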
- -Sub-agent configuration: -- All sub-agents should use `model: "haiku"` to minimize cost and latency -- All sub-agents should use `max_turns: 5` to prevent hanging indefinitely - -Steps: -1. reset - Ensure clean environment before testing (clears queue, reverts files) -2. run_not_fire_tests - Run all "should NOT fire" tests in PARALLEL sub-agents (6 tests) -3. run_fire_tests - Run all "should fire" tests in SERIAL sub-agents with resets between (6 tests) -4. infinite_block_tests - Run infinite block tests in SERIAL (4 tests - both fire and not-fire) - -Reset procedure (see steps/reset.md): -- Reset runs FIRST to ensure a clean environment before any tests -- Each step also calls reset internally when needed (between tests, after completion) -- Reset reverts git changes, removes created files, and clears the rules queue - -Test types covered: -- Trigger/Safety mode -- Set mode (bidirectional) -- Pair mode (directional) -- Command action -- Multi safety -- Infinite block (prompt and command) - in dedicated step -- Created mode (new files only) - - - -## Work Branch - -Use branch format: `deepwork/manual_tests-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/manual_tests-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `clean_environment` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. **Environment Clean**: Git changes reverted, created files removed, and rules queue cleared -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "run_all step 1/4 complete, outputs: clean_environment" -3. **Continue workflow**: Use Skill tool to invoke `/manual_tests.run_not_fire_tests` - ---- - -**Reference files**: `.deepwork/jobs/manual_tests/job.yml`, `.deepwork/jobs/manual_tests/steps/reset.md` \ No newline at end of file diff --git a/.claude/skills/manual_tests.run_fire_tests/SKILL.md b/.claude/skills/manual_tests.run_fire_tests/SKILL.md deleted file mode 100644 index 307f035c..00000000 --- a/.claude/skills/manual_tests.run_fire_tests/SKILL.md +++ /dev/null @@ -1,252 +0,0 @@ ---- -name: manual_tests.run_fire_tests -description: "Runs all 6 'should fire' tests serially with resets between each. Use after NOT-fire tests to verify rules fire correctly." -user-invocable: false - ---- - -# manual_tests.run_fire_tests - -**Step 3/4** in **run_all** workflow - -> Run all manual tests: reset, NOT-fire tests, fire tests, and infinite block tests - -> Runs all manual hook/rule tests using sub-agents. 
Use when validating that DeepWork rules fire correctly. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/manual_tests.run_not_fire_tests` - -## Instructions - -**Goal**: Runs all 6 'should fire' tests serially with resets between each. Use after NOT-fire tests to verify rules fire correctly. - -# Run Should-Fire Tests - -## Objective - -Run all "should fire" tests in **serial** sub-agents to verify that rules fire correctly when their trigger conditions are met without safety conditions. - -## CRITICAL: Sub-Agent Requirement - -**You MUST spawn sub-agents to make all file edits. DO NOT edit the test files yourself.** - -Why sub-agents are required: -1. Sub-agents run in isolated contexts where file changes are detected -2. When a sub-agent completes, the Stop hook **automatically** evaluates rules -3. You (the main agent) observe whether hooks fired - you do NOT manually trigger them -4. If you edit files directly, the hooks won't fire because you're not a completing sub-agent - -**NEVER manually run `echo '{}' | python -m deepwork.hooks.rules_check`** - this defeats the purpose of the test. Hooks must fire AUTOMATICALLY when sub-agents return. - -## CRITICAL: Serial Execution - -**These tests MUST run ONE AT A TIME, with resets between each.** - -Why serial execution is required: -- These tests edit ONLY the trigger file (not the safety) -- If multiple sub-agents run in parallel, sub-agent A's hook will see changes from sub-agent B -- This causes cross-contamination: A gets blocked by rules triggered by B's changes -- Run one test, observe the hook, reset, then run the next - -## Task - -Run all 6 "should fire" tests in **serial** sub-agents, resetting between each, and verify that blocking hooks fire automatically. - -### Process - -For EACH test below, follow this cycle: - -1. **Launch a sub-agent** using the Task tool with: - - `model: "haiku"` - Use the fast model to minimize cost and latency - - `max_turns: 5` - Prevent sub-agents from hanging indefinitely -2. **Wait for the sub-agent to complete** -3. **Observe whether the hook fired automatically** - you should see a blocking prompt or command output -4. **If no visible blocking occurred, check the queue**: - ```bash - ls -la .deepwork/tmp/rules/queue/ - cat .deepwork/tmp/rules/queue/*.json 2>/dev/null - ``` - - If queue entries exist with status "queued", the hook DID fire but blocking wasn't visible - - If queue is empty, the hook did NOT fire at all - - Record the queue status along with the result -5. **Record the result** - pass if hook fired (visible block OR queue entry), fail if neither -6. **Reset** (MANDATORY after each test) - follow the reset step instructions: - ```bash - git reset HEAD manual_tests/ && git checkout -- manual_tests/ && rm -f manual_tests/test_created_mode/new_config.yml - deepwork rules clear_queue - ``` - See [reset.md](reset.md) for detailed explanation of these commands. -7. **Check for early termination**: If **2 tests have now failed**, immediately: - - Stop running any remaining tests - - Report the results summary showing which tests passed/failed - - The job halts here - do NOT proceed with remaining tests -8. **Proceed to the next test** (only if fewer than 2 failures) - -**IMPORTANT**: Only launch ONE sub-agent at a time. Wait for it to complete and reset before launching the next. - -### Test Cases (run serially) - -**Test 1: Trigger/Safety** -- Sub-agent prompt: "Edit ONLY `manual_tests/test_trigger_safety_mode/feature.py` to add a comment. 
Do NOT edit the `_doc.md` file." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected: Hook fires with prompt about updating documentation - -**Test 2: Set Mode** -- Sub-agent prompt: "Edit ONLY `manual_tests/test_set_mode/module_source.py` to add a comment. Do NOT edit the `_test.py` file." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected: Hook fires with prompt about updating tests - -**Test 3: Pair Mode** -- Sub-agent prompt: "Edit ONLY `manual_tests/test_pair_mode/handler_trigger.py` to add a comment. Do NOT edit the `_expected.md` file." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected: Hook fires with prompt about updating expected output - -**Test 4: Command Action** -- Sub-agent prompt: "Edit `manual_tests/test_command_action/input.txt` to add some text." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected: Command runs automatically, appending to the log file (this rule always runs, no safety condition) - -**Test 5: Multi Safety** -- Sub-agent prompt: "Edit ONLY `manual_tests/test_multi_safety/core.py` to add a comment. Do NOT edit any of the safety files (`_safety_a.md`, `_safety_b.md`, or `_safety_c.md`)." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected: Hook fires with prompt about updating safety documentation - -**Test 6: Created Mode** -- Sub-agent prompt: "Create a NEW file `manual_tests/test_created_mode/new_config.yml` with some YAML content. This must be a NEW file, not a modification." -- Sub-agent config: `model: "haiku"`, `max_turns: 5` -- Expected: Hook fires with prompt about new configuration files - -### Results Tracking - -Record the result after each test: - -| Test Case | Should Fire | Visible Block? | Queue Entry? | Result | -|-----------|-------------|:--------------:|:------------:|:------:| -| Trigger/Safety | Edit .py only | | | | -| Set Mode | Edit _source.py only | | | | -| Pair Mode | Edit _trigger.py only | | | | -| Command Action | Edit .txt | | | | -| Multi Safety | Edit .py only | | | | -| Created Mode | Create NEW .yml | | | | - -**Queue Entry Status Guide:** -- If queue has entry with status "queued" -> Hook fired, rule was shown to agent -- If queue has entry with status "passed" -> Hook fired, rule was satisfied -- If queue is empty -> Hook did NOT fire - -## Quality Criteria - -- **Sub-agents spawned**: Tests were run using the Task tool to spawn sub-agents - the main agent did NOT edit files directly -- **Correct sub-agent config**: All sub-agents used `model: "haiku"` and `max_turns: 5` -- **Serial execution**: Sub-agents were launched ONE AT A TIME, not in parallel -- **Reset between tests**: Reset step was followed after each test -- **Hooks fired automatically**: The main agent observed the blocking hooks firing automatically when each sub-agent returned - the agent did NOT manually run rules_check -- **Early termination on 2 failures**: If 2 tests failed, testing halted immediately and results were reported -- **Results recorded**: Pass/fail status was recorded for each test case -- When all criteria are met, include `Quality Criteria Met` in your response - -## Reference - -See [test_reference.md](test_reference.md) for the complete test matrix and rule descriptions. - -## Context - -This step runs after the "should NOT fire" tests. These tests verify that rules correctly fire when trigger conditions are met without safety conditions. The serial execution with resets is essential to prevent cross-contamination between tests. 
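As a compact illustration, the observe-and-reset portion of each cycle might look like the following (a sketch only: the test names in the loop are shorthand, and the file edit itself always happens inside a Task-tool sub-agent, never in this shell):

```bash
# One pass per test, strictly serial. The sub-agent (model haiku,
# max_turns 5) makes the edit and returns before this observation runs.
for t in trigger_safety set_mode pair_mode command_action multi_safety created_mode; do
  # ... sub-agent for test "$t" runs and returns here ...
  ls -la .deepwork/tmp/rules/queue/        # observe: did the hook fire?
  git reset HEAD manual_tests/ && git checkout -- manual_tests/
  rm -f manual_tests/test_created_mode/new_config.yml
  deepwork rules clear_queue               # reset before the next test
done
```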
Infinite block tests are handled in a separate step. - - -### Job Context - -A workflow for running manual tests that validate DeepWork rules/hooks fire correctly. - -The **run_all** workflow tests that rules fire when they should AND do not fire when they shouldn't. -Each test is run in a SUB-AGENT (not the main agent) because: -1. Sub-agents run in isolated contexts where file changes can be detected -2. The Stop hook automatically evaluates rules when each sub-agent completes -3. The main agent can observe whether hooks fired without triggering them manually - -CRITICAL: All tests MUST run in sub-agents. The main agent MUST NOT make the file -edits itself - it spawns sub-agents to make edits, then observes whether the hooks -fired automatically when those sub-agents returned. - -Sub-agent configuration: -- All sub-agents should use `model: "haiku"` to minimize cost and latency -- All sub-agents should use `max_turns: 5` to prevent hanging indefinitely - -Steps: -1. reset - Ensure clean environment before testing (clears queue, reverts files) -2. run_not_fire_tests - Run all "should NOT fire" tests in PARALLEL sub-agents (6 tests) -3. run_fire_tests - Run all "should fire" tests in SERIAL sub-agents with resets between (6 tests) -4. infinite_block_tests - Run infinite block tests in SERIAL (4 tests - both fire and not-fire) - -Reset procedure (see steps/reset.md): -- Reset runs FIRST to ensure a clean environment before any tests -- Each step also calls reset internally when needed (between tests, after completion) -- Reset reverts git changes, removes created files, and clears the rules queue - -Test types covered: -- Trigger/Safety mode -- Set mode (bidirectional) -- Pair mode (directional) -- Command action -- Multi safety -- Infinite block (prompt and command) - in dedicated step -- Created mode (new files only) - - -## Required Inputs - - -**Files from Previous Steps** - Read these first: -- `not_fire_results` (from `run_not_fire_tests`) - -## Work Branch - -Use branch format: `deepwork/manual_tests-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/manual_tests-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `fire_results` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. **Sub-Agents Used**: Did the main agent spawn a sub-agent (using the Task tool) for EACH test? The main agent must NOT edit the test files directly. -2. **Sub-Agent Config**: Did all sub-agents use `model: "haiku"` and `max_turns: 5`? -3. **Serial Execution**: Were sub-agents launched ONE AT A TIME (not in parallel) to prevent cross-contamination? -4. **Hooks Fired Automatically**: Did the main agent observe the blocking hooks firing automatically when each sub-agent returned? The agent must NOT manually run the rules_check command. -5. **Reset Between Tests**: Was the reset step called internally after each test to revert files and prevent cross-contamination? -6. 
**Early Termination**: If 2 tests failed, did testing halt immediately with results reported? -7. **Results Recorded**: Did the main agent track pass/fail status for each test case? -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "run_all step 3/4 complete, outputs: fire_results" -3. **Continue workflow**: Use Skill tool to invoke `/manual_tests.infinite_block_tests` - ---- - -**Reference files**: `.deepwork/jobs/manual_tests/job.yml`, `.deepwork/jobs/manual_tests/steps/run_fire_tests.md` \ No newline at end of file diff --git a/.claude/skills/manual_tests.run_not_fire_tests/SKILL.md b/.claude/skills/manual_tests.run_not_fire_tests/SKILL.md deleted file mode 100644 index bdcbc58f..00000000 --- a/.claude/skills/manual_tests.run_not_fire_tests/SKILL.md +++ /dev/null @@ -1,238 +0,0 @@ ---- -name: manual_tests.run_not_fire_tests -description: "Runs all 6 'should NOT fire' tests in parallel sub-agents. Use to verify rules don't fire when safety conditions are met." -user-invocable: false - ---- - -# manual_tests.run_not_fire_tests - -**Step 2/4** in **run_all** workflow - -> Run all manual tests: reset, NOT-fire tests, fire tests, and infinite block tests - -> Runs all manual hook/rule tests using sub-agents. Use when validating that DeepWork rules fire correctly. - -## Prerequisites (Verify First) - -Before proceeding, confirm these steps are complete: -- `/manual_tests.reset` - -## Instructions - -**Goal**: Runs all 6 'should NOT fire' tests in parallel sub-agents. Use to verify rules don't fire when safety conditions are met. - -# Run Should-NOT-Fire Tests - -## Objective - -Run all "should NOT fire" tests in parallel sub-agents to verify that rules do not fire when their safety conditions are met. - -## CRITICAL: Sub-Agent Requirement - -**You MUST spawn sub-agents to make all file edits. DO NOT edit the test files yourself.** - -Why sub-agents are required: -1. Sub-agents run in isolated contexts where file changes are detected -2. When a sub-agent completes, the Stop hook **automatically** evaluates rules -3. You (the main agent) observe whether hooks fired - you do NOT manually trigger them -4. If you edit files directly, the hooks won't fire because you're not a completing sub-agent - -**NEVER manually run `echo '{}' | python -m deepwork.hooks.rules_check`** - this defeats the purpose of the test. Hooks must fire AUTOMATICALLY when sub-agents return. - -## Task - -Run all 6 "should NOT fire" tests in **parallel** sub-agents, then verify no blocking hooks fired. - -### Process - -1. **Launch parallel sub-agents for all "should NOT fire" tests** - - Use the Task tool to spawn **ALL of the following sub-agents in a SINGLE message** (parallel execution). - - **Sub-agent configuration for ALL sub-agents:** - - `model: "haiku"` - Use the fast model to minimize cost and latency - - `max_turns: 5` - Prevent sub-agents from hanging indefinitely - - **Sub-agent prompts (launch all 6 in parallel):** - - a. 
**Trigger/Safety test** - "Edit `manual_tests/test_trigger_safety_mode/feature.py` to add a comment, AND edit `manual_tests/test_trigger_safety_mode/feature_doc.md` to add a note. Both files must be edited so the rule does NOT fire." - - b. **Set Mode test** - "Edit `manual_tests/test_set_mode/module_source.py` to add a comment, AND edit `manual_tests/test_set_mode/module_test.py` to add a test comment. Both files must be edited so the rule does NOT fire." - - c. **Pair Mode (forward) test** - "Edit `manual_tests/test_pair_mode/handler_trigger.py` to add a comment, AND edit `manual_tests/test_pair_mode/handler_expected.md` to add a note. Both files must be edited so the rule does NOT fire." - - d. **Pair Mode (reverse) test** - "Edit ONLY `manual_tests/test_pair_mode/handler_expected.md` to add a note. Only the expected file should be edited - this tests that the pair rule only fires in one direction." - - e. **Multi Safety test** - "Edit `manual_tests/test_multi_safety/core.py` to add a comment, AND edit `manual_tests/test_multi_safety/core_safety_a.md` to add a note. Both files must be edited so the rule does NOT fire." - - f. **Created Mode test** - "Modify the EXISTING file `manual_tests/test_created_mode/existing.yml` by adding a comment. Do NOT create a new file - only modify the existing one. The created mode rule should NOT fire for modifications." - -2. **Observe the results** - - When each sub-agent returns: - - **If no blocking hook fired**: Preliminary pass - proceed to queue verification - - **If a blocking hook fired**: The test FAILED - investigate why the rule fired when it shouldn't have - - **Remember**: You are OBSERVING whether hooks fired automatically. Do NOT run any verification commands manually during sub-agent execution. - -3. **Verify no queue entries** (CRITICAL for "should NOT fire" tests) - - After ALL sub-agents have completed, verify the rules queue is empty: - ```bash - ls -la .deepwork/tmp/rules/queue/ - cat .deepwork/tmp/rules/queue/*.json 2>/dev/null - ``` - - - **If queue is empty**: All tests PASSED - rules correctly did not fire - - **If queue has entries**: Tests FAILED - rules fired when they shouldn't have. Check which rule fired and investigate. - - This verification is essential because some rules may fire without visible blocking but still create queue entries. - -4. **Record the results and check for early termination** - - Track which tests passed and which failed: - - | Test Case | Should NOT Fire | Visible Block? | Queue Entry? | Result | - |-----------|:---------------:|:--------------:|:------------:|:------:| - | Trigger/Safety | Edit both files | | | | - | Set Mode | Edit both files | | | | - | Pair Mode (forward) | Edit both files | | | | - | Pair Mode (reverse) | Edit expected only | | | | - | Multi Safety | Edit both files | | | | - | Created Mode | Modify existing | | | | - - **Result criteria**: PASS only if NO visible block AND NO queue entry. FAIL if either occurred. - - **EARLY TERMINATION**: If **2 tests have failed**, immediately: - 1. Stop running any remaining tests - 2. Reset (see step 5) - 3. Report the results summary showing which tests passed/failed - 4. Do NOT proceed to the next step - the job halts here - -5. **Reset** (MANDATORY - call the reset step internally) - - **IMPORTANT**: This step is MANDATORY and must run regardless of whether tests passed or failed. - - Follow the reset step instructions. 
Run these commands to clean up: - ```bash - git reset HEAD manual_tests/ && git checkout -- manual_tests/ && rm -f manual_tests/test_created_mode/new_config.yml - deepwork rules clear_queue - ``` - - See [reset.md](reset.md) for detailed explanation of these commands. - -## Quality Criteria - -- **Sub-agents spawned**: All 6 tests were run using the Task tool to spawn sub-agents - the main agent did NOT edit files directly -- **Correct sub-agent config**: All sub-agents used `model: "haiku"` and `max_turns: 5` -- **Parallel execution**: All 6 sub-agents were launched in a single message (parallel) -- **Hooks observed (not triggered)**: The main agent observed hook behavior without manually running rules_check -- **Queue verified empty**: After all sub-agents completed, the rules queue was checked and confirmed empty (no queue entries = rules did not fire) -- **Early termination on 2 failures**: If 2 tests failed, testing halted immediately and results were reported -- **Reset performed**: Reset step was followed after tests completed (regardless of pass/fail) -- When all criteria are met, include `Quality Criteria Met` in your response - -## Reference - -See [test_reference.md](test_reference.md) for the complete test matrix and rule descriptions. - -## Context - -This step runs after the reset step (which ensures a clean environment) and tests that rules correctly do NOT fire when safety conditions are met. The "should fire" tests run after these complete. Infinite block tests are handled in a separate step. - - -### Job Context - -A workflow for running manual tests that validate DeepWork rules/hooks fire correctly. - -The **run_all** workflow tests that rules fire when they should AND do not fire when they shouldn't. -Each test is run in a SUB-AGENT (not the main agent) because: -1. Sub-agents run in isolated contexts where file changes can be detected -2. The Stop hook automatically evaluates rules when each sub-agent completes -3. The main agent can observe whether hooks fired without triggering them manually - -CRITICAL: All tests MUST run in sub-agents. The main agent MUST NOT make the file -edits itself - it spawns sub-agents to make edits, then observes whether the hooks -fired automatically when those sub-agents returned. - -Sub-agent configuration: -- All sub-agents should use `model: "haiku"` to minimize cost and latency -- All sub-agents should use `max_turns: 5` to prevent hanging indefinitely - -Steps: -1. reset - Ensure clean environment before testing (clears queue, reverts files) -2. run_not_fire_tests - Run all "should NOT fire" tests in PARALLEL sub-agents (6 tests) -3. run_fire_tests - Run all "should fire" tests in SERIAL sub-agents with resets between (6 tests) -4. 
infinite_block_tests - Run infinite block tests in SERIAL (4 tests - both fire and not-fire) - -Reset procedure (see steps/reset.md): -- Reset runs FIRST to ensure a clean environment before any tests -- Each step also calls reset internally when needed (between tests, after completion) -- Reset reverts git changes, removes created files, and clears the rules queue - -Test types covered: -- Trigger/Safety mode -- Set mode (bidirectional) -- Pair mode (directional) -- Command action -- Multi safety -- Infinite block (prompt and command) - in dedicated step -- Created mode (new files only) - - -## Required Inputs - - -**Files from Previous Steps** - Read these first: -- `clean_environment` (from `reset`) - -## Work Branch - -Use branch format: `deepwork/manual_tests-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/manual_tests-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `not_fire_results` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## Quality Validation - -**Before completing this step, you MUST have your work reviewed against the quality criteria below.** - -Use a sub-agent (Haiku model) to review your work against these criteria: - -**Criteria (all must be satisfied)**: -1. **Sub-Agents Used**: Did the main agent spawn sub-agents (using the Task tool) to make the file edits? The main agent must NOT edit the test files directly. -2. **Sub-Agent Config**: Did all sub-agents use `model: "haiku"` and `max_turns: 5`? -3. **Parallel Execution**: Were all 6 sub-agents launched in parallel (in a single message with multiple Task tool calls)? -4. **Hooks Observed**: Did the main agent observe that no blocking hooks fired when the sub-agents returned? The hooks fire AUTOMATICALLY - the agent must NOT manually run the rules_check command. -5. **Queue Verified Empty**: After all sub-agents completed, was the rules queue checked and confirmed empty (no entries = rules did not fire)? -6. **Early Termination**: If 2 tests failed, did testing halt immediately with results reported? -7. **Reset Performed**: Was the reset step called internally after tests completed (or after early termination)? -**Review Process**: -1. Once you believe your work is complete, spawn a sub-agent using Haiku to review your work against the quality criteria above -2. The sub-agent should examine your outputs and verify each criterion is met -3. If the sub-agent identifies valid issues, fix them -4. Have the sub-agent review again until all valid feedback has been addressed -5. Only mark the step complete when the sub-agent confirms all criteria are satisfied - -## On Completion - -1. Verify outputs are created -2. Inform user: "run_all step 2/4 complete, outputs: not_fire_results" -3. 
**Continue workflow**: Use Skill tool to invoke `/manual_tests.run_fire_tests` - ---- - -**Reference files**: `.deepwork/jobs/manual_tests/job.yml`, `.deepwork/jobs/manual_tests/steps/run_not_fire_tests.md` \ No newline at end of file diff --git a/.claude/skills/manual_tests/SKILL.md b/.claude/skills/manual_tests/SKILL.md index 59b53adc..9afa86f2 100644 --- a/.claude/skills/manual_tests/SKILL.md +++ b/.claude/skills/manual_tests/SKILL.md @@ -3,12 +3,10 @@ name: manual_tests description: "Runs all manual hook/rule tests using sub-agents. Use when validating that DeepWork rules fire correctly." --- -# manual_tests +# manual_tests Agent Runs all manual hook/rule tests using sub-agents. Use when validating that DeepWork rules fire correctly. -> **CRITICAL**: Always invoke steps using the Skill tool. Never copy/paste step instructions directly. - A workflow for running manual tests that validate DeepWork rules/hooks fire correctly. The **run_all** workflow tests that rules fire when they should AND do not fire when they shouldn't. @@ -46,57 +44,663 @@ Test types covered: - Created mode (new files only) -## Workflows +## Agent Overview -### run_all +This agent handles the **manual_tests** job with 4 skills. -Run all manual tests: reset, NOT-fire tests, fire tests, and infinite block tests +**Workflows**: run_all +--- -**Steps in order**: -1. **reset** - Runs FIRST to ensure clean environment. Also called internally by other steps when they need to revert changes and clear the queue. -2. **run_not_fire_tests** - Runs all 6 'should NOT fire' tests in parallel sub-agents. Use to verify rules don't fire when safety conditions are met. -3. **run_fire_tests** - Runs all 6 'should fire' tests serially with resets between each. Use after NOT-fire tests to verify rules fire correctly. -4. **infinite_block_tests** - Runs all 4 infinite block tests serially. Tests both 'should fire' (no promise) and 'should NOT fire' (with promise) scenarios. +## How to Use This Agent -**Start workflow**: `/manual_tests.reset` +### Workflows +- **run_all**: Run all manual tests: reset, NOT-fire tests, fire tests, and infinite block tests (reset → run_not_fire_tests → run_fire_tests → infinite_block_tests) + - Start: `reset` +### All Skills +- `reset` - Runs FIRST to ensure clean environment. Also called internally by other steps when they need to revert changes and clear the queue. +- `run_not_fire_tests` - Runs all 6 'should NOT fire' tests in parallel sub-agents. Use to verify rules don't fire when safety conditions are met. +- `run_fire_tests` - Runs all 6 'should fire' tests serially with resets between each. Use after NOT-fire tests to verify rules fire correctly. +- `infinite_block_tests` - Runs all 4 infinite block tests serially. Tests both 'should fire' (no promise) and 'should NOT fire' (with promise) scenarios. -## Execution Instructions +--- -### Step 1: Analyze Intent +## Agent Execution Instructions -Parse any text following `/manual_tests` to determine user intent: -- "run_all" or related terms → start run_all workflow at `manual_tests.reset` +When invoked, follow these steps: -### Step 2: Invoke Starting Step +### Step 1: Understand Intent -Use the Skill tool to invoke the identified starting step: -``` -Skill tool: manual_tests.reset +Parse the user's request to determine: +1. Which workflow or skill to execute +2. Any parameters or context provided +3. Whether this is a continuation of previous work + +### Step 2: Check Work Branch + +Before executing any skill: +1. Check current git branch +2. 
If on a `deepwork/manual_tests-*` branch: continue using it +3. If on main/master: create new branch `deepwork/manual_tests-[instance]-$(date +%Y%m%d)` + +### Step 3: Execute the Appropriate Skill + +Navigate to the relevant skill section below and follow its instructions. + +### Step 4: Workflow Continuation + +After completing a workflow step: +1. Inform the user of completion and outputs created +2. Automatically proceed to the next step if one exists +3. Continue until the workflow is complete or the user intervenes + +--- + +## Skills + +### Skill: reset + +**Type**: Workflow step 1/4 in **run_all** + +**Description**: Runs FIRST to ensure clean environment. Also called internally by other steps when they need to revert changes and clear the queue. + + + + +#### Instructions + +# Reset Manual Tests Environment + +## Objective + +Reset the manual tests environment by reverting all file changes and clearing the rules queue. + +## Purpose + +This step contains all the reset logic that other steps can call when they need to clean up between or after tests. It ensures consistent cleanup across all test steps. + +## Reset Commands + +Run these commands to reset the environment: + +```bash +git reset HEAD manual_tests/ && git checkout -- manual_tests/ && rm -f manual_tests/test_created_mode/new_config.yml +deepwork rules clear_queue ``` -### Step 3: Continue Workflow Automatically +## Command Explanation + +- `git reset HEAD manual_tests/` - Unstages files from the index (rules_check uses `git add -A` which stages changes) +- `git checkout -- manual_tests/` - Reverts working tree to match HEAD +- `rm -f manual_tests/test_created_mode/new_config.yml` - Removes any new files created during tests (the created mode test creates this file) +- `deepwork rules clear_queue` - Clears the rules queue so rules can fire again (prevents anti-infinite-loop mechanism from blocking subsequent tests) + +## When to Reset + +- **After each serial test**: Reset immediately after observing the result to prevent cross-contamination +- **After parallel tests complete**: Reset once all parallel sub-agents have returned +- **On early termination**: Reset before reporting failure results +- **Before starting a new test step**: Ensure clean state + +## Quality Criteria + +- **All changes reverted**: `git status` shows no changes in `manual_tests/` +- **Queue cleared**: `.deepwork/tmp/rules/queue/` is empty +- **New files removed**: `manual_tests/test_created_mode/new_config.yml` does not exist + + +#### Outputs + +Create these files/directories: +- `clean_environment` +#### Quality Validation + +Before completing this skill, verify: +1. **Environment Clean**: Git changes reverted, created files removed, and rules queue cleared + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "reset complete, outputs: clean_environment" +3. **Continue to next skill**: Proceed to `run_not_fire_tests` + +--- + +### Skill: run_not_fire_tests + +**Type**: Workflow step 2/4 in **run_all** + +**Description**: Runs all 6 'should NOT fire' tests in parallel sub-agents. Use to verify rules don't fire when safety conditions are met. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `reset` + + +#### Input Files + +Read these files (from previous steps): +- `clean_environment` (from `reset`) -After each step completes: -1. Check if there's a next step in the workflow sequence -2. 
Invoke the next step using the Skill tool -3. Repeat until workflow is complete or user intervenes +#### Instructions -**Note**: Standalone skills do not auto-continue to other steps. +# Run Should-NOT-Fire Tests + +## Objective + +Run all "should NOT fire" tests in parallel sub-agents to verify that rules do not fire when their safety conditions are met. + +## CRITICAL: Sub-Agent Requirement + +**You MUST spawn sub-agents to make all file edits. DO NOT edit the test files yourself.** + +Why sub-agents are required: +1. Sub-agents run in isolated contexts where file changes are detected +2. When a sub-agent completes, the Stop hook **automatically** evaluates rules +3. You (the main agent) observe whether hooks fired - you do NOT manually trigger them +4. If you edit files directly, the hooks won't fire because you're not a completing sub-agent + +**NEVER manually run `echo '{}' | python -m deepwork.hooks.rules_check`** - this defeats the purpose of the test. Hooks must fire AUTOMATICALLY when sub-agents return. + +## Task + +Run all 6 "should NOT fire" tests in **parallel** sub-agents, then verify no blocking hooks fired. + +### Process + +1. **Launch parallel sub-agents for all "should NOT fire" tests** + + Use the Task tool to spawn **ALL of the following sub-agents in a SINGLE message** (parallel execution). + + **Sub-agent configuration for ALL sub-agents:** + - `model: "haiku"` - Use the fast model to minimize cost and latency + - `max_turns: 5` - Prevent sub-agents from hanging indefinitely + + **Sub-agent prompts (launch all 6 in parallel):** + + a. **Trigger/Safety test** - "Edit `manual_tests/test_trigger_safety_mode/feature.py` to add a comment, AND edit `manual_tests/test_trigger_safety_mode/feature_doc.md` to add a note. Both files must be edited so the rule does NOT fire." + + b. **Set Mode test** - "Edit `manual_tests/test_set_mode/module_source.py` to add a comment, AND edit `manual_tests/test_set_mode/module_test.py` to add a test comment. Both files must be edited so the rule does NOT fire." + + c. **Pair Mode (forward) test** - "Edit `manual_tests/test_pair_mode/handler_trigger.py` to add a comment, AND edit `manual_tests/test_pair_mode/handler_expected.md` to add a note. Both files must be edited so the rule does NOT fire." + + d. **Pair Mode (reverse) test** - "Edit ONLY `manual_tests/test_pair_mode/handler_expected.md` to add a note. Only the expected file should be edited - this tests that the pair rule only fires in one direction." + + e. **Multi Safety test** - "Edit `manual_tests/test_multi_safety/core.py` to add a comment, AND edit `manual_tests/test_multi_safety/core_safety_a.md` to add a note. Both files must be edited so the rule does NOT fire." + + f. **Created Mode test** - "Modify the EXISTING file `manual_tests/test_created_mode/existing.yml` by adding a comment. Do NOT create a new file - only modify the existing one. The created mode rule should NOT fire for modifications." + +2. **Observe the results** + + When each sub-agent returns: + - **If no blocking hook fired**: Preliminary pass - proceed to queue verification + - **If a blocking hook fired**: The test FAILED - investigate why the rule fired when it shouldn't have + + **Remember**: You are OBSERVING whether hooks fired automatically. Do NOT run any verification commands manually during sub-agent execution. + +3. 
**Verify no queue entries** (CRITICAL for "should NOT fire" tests) + + After ALL sub-agents have completed, verify the rules queue is empty: + ```bash + ls -la .deepwork/tmp/rules/queue/ + cat .deepwork/tmp/rules/queue/*.json 2>/dev/null + ``` + + - **If queue is empty**: All tests PASSED - rules correctly did not fire + - **If queue has entries**: Tests FAILED - rules fired when they shouldn't have. Check which rule fired and investigate. + + This verification is essential because some rules may fire without visible blocking but still create queue entries. + +4. **Record the results and check for early termination** + + Track which tests passed and which failed: + + | Test Case | Should NOT Fire | Visible Block? | Queue Entry? | Result | + |-----------|:---------------:|:--------------:|:------------:|:------:| + | Trigger/Safety | Edit both files | | | | + | Set Mode | Edit both files | | | | + | Pair Mode (forward) | Edit both files | | | | + | Pair Mode (reverse) | Edit expected only | | | | + | Multi Safety | Edit both files | | | | + | Created Mode | Modify existing | | | | + + **Result criteria**: PASS only if NO visible block AND NO queue entry. FAIL if either occurred. + + **EARLY TERMINATION**: If **2 tests have failed**, immediately: + 1. Stop running any remaining tests + 2. Reset (see step 5) + 3. Report the results summary showing which tests passed/failed + 4. Do NOT proceed to the next step - the job halts here + +5. **Reset** (MANDATORY - call the reset step internally) + + **IMPORTANT**: This step is MANDATORY and must run regardless of whether tests passed or failed. + + Follow the reset step instructions. Run these commands to clean up: + ```bash + git reset HEAD manual_tests/ && git checkout -- manual_tests/ && rm -f manual_tests/test_created_mode/new_config.yml + deepwork rules clear_queue + ``` + + See [reset.md](reset.md) for detailed explanation of these commands. + +## Quality Criteria + +- **Sub-agents spawned**: All 6 tests were run using the Task tool to spawn sub-agents - the main agent did NOT edit files directly +- **Correct sub-agent config**: All sub-agents used `model: "haiku"` and `max_turns: 5` +- **Parallel execution**: All 6 sub-agents were launched in a single message (parallel) +- **Hooks observed (not triggered)**: The main agent observed hook behavior without manually running rules_check +- **Queue verified empty**: After all sub-agents completed, the rules queue was checked and confirmed empty (no queue entries = rules did not fire) +- **Early termination on 2 failures**: If 2 tests failed, testing halted immediately and results were reported +- **Reset performed**: Reset step was followed after tests completed (regardless of pass/fail) +- When all criteria are met, include `Quality Criteria Met` in your response + +## Reference + +See [test_reference.md](test_reference.md) for the complete test matrix and rule descriptions. + +## Context + +This step runs after the reset step (which ensures a clean environment) and tests that rules correctly do NOT fire when safety conditions are met. The "should fire" tests run after these complete. Infinite block tests are handled in a separate step. + + +#### Outputs + +Create these files/directories: +- `not_fire_results` +#### Quality Validation + +Before completing this skill, verify: +1. **Sub-Agents Used**: Did the main agent spawn sub-agents (using the Task tool) to make the file edits? The main agent must NOT edit the test files directly. +2. 
**Sub-Agent Config**: Did all sub-agents use `model: "haiku"` and `max_turns: 5`? +3. **Parallel Execution**: Were all 6 sub-agents launched in parallel (in a single message with multiple Task tool calls)? +4. **Hooks Observed**: Did the main agent observe that no blocking hooks fired when the sub-agents returned? The hooks fire AUTOMATICALLY - the agent must NOT manually run the rules_check command. +5. **Queue Verified Empty**: After all sub-agents completed, was the rules queue checked and confirmed empty (no entries = rules did not fire)? +6. **Early Termination**: If 2 tests failed, did testing halt immediately with results reported? +7. **Reset Performed**: Was the reset step called internally after tests completed (or after early termination)? + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "run_not_fire_tests complete, outputs: not_fire_results" +3. **Continue to next skill**: Proceed to `run_fire_tests` + +--- -### Handling Ambiguous Intent +### Skill: run_fire_tests -If user intent is unclear, use AskUserQuestion to clarify: -- Present available workflows and standalone skills as options -- Let user select the starting point +**Type**: Workflow step 3/4 in **run_all** + +**Description**: Runs all 6 'should fire' tests serially with resets between each. Use after NOT-fire tests to verify rules fire correctly. + +#### Prerequisites + +Before running this skill, ensure these are complete: +- `run_not_fire_tests` + + +#### Input Files + +Read these files (from previous steps): +- `not_fire_results` (from `run_not_fire_tests`) + +#### Instructions + +# Run Should-Fire Tests + +## Objective + +Run all "should fire" tests in **serial** sub-agents to verify that rules fire correctly when their trigger conditions are met without safety conditions. + +## CRITICAL: Sub-Agent Requirement + +**You MUST spawn sub-agents to make all file edits. DO NOT edit the test files yourself.** + +Why sub-agents are required: +1. Sub-agents run in isolated contexts where file changes are detected +2. When a sub-agent completes, the Stop hook **automatically** evaluates rules +3. You (the main agent) observe whether hooks fired - you do NOT manually trigger them +4. If you edit files directly, the hooks won't fire because you're not a completing sub-agent + +**NEVER manually run `echo '{}' | python -m deepwork.hooks.rules_check`** - this defeats the purpose of the test. Hooks must fire AUTOMATICALLY when sub-agents return. + +## CRITICAL: Serial Execution + +**These tests MUST run ONE AT A TIME, with resets between each.** + +Why serial execution is required: +- These tests edit ONLY the trigger file (not the safety) +- If multiple sub-agents run in parallel, sub-agent A's hook will see changes from sub-agent B +- This causes cross-contamination: A gets blocked by rules triggered by B's changes +- Run one test, observe the hook, reset, then run the next + +## Task + +Run all 6 "should fire" tests in **serial** sub-agents, resetting between each, and verify that blocking hooks fire automatically. + +### Process + +For EACH test below, follow this cycle: + +1. **Launch a sub-agent** using the Task tool with: + - `model: "haiku"` - Use the fast model to minimize cost and latency + - `max_turns: 5` - Prevent sub-agents from hanging indefinitely +2. **Wait for the sub-agent to complete** +3. 
**Observe whether the hook fired automatically** - you should see a blocking prompt or command output +4. **If no visible blocking occurred, check the queue**: + ```bash + ls -la .deepwork/tmp/rules/queue/ + cat .deepwork/tmp/rules/queue/*.json 2>/dev/null + ``` + - If queue entries exist with status "queued", the hook DID fire but blocking wasn't visible + - If queue is empty, the hook did NOT fire at all + - Record the queue status along with the result +5. **Record the result** - pass if hook fired (visible block OR queue entry), fail if neither +6. **Reset** (MANDATORY after each test) - follow the reset step instructions: + ```bash + git reset HEAD manual_tests/ && git checkout -- manual_tests/ && rm -f manual_tests/test_created_mode/new_config.yml + deepwork rules clear_queue + ``` + See [reset.md](reset.md) for detailed explanation of these commands. +7. **Check for early termination**: If **2 tests have now failed**, immediately: + - Stop running any remaining tests + - Report the results summary showing which tests passed/failed + - The job halts here - do NOT proceed with remaining tests +8. **Proceed to the next test** (only if fewer than 2 failures) + +**IMPORTANT**: Only launch ONE sub-agent at a time. Wait for it to complete and reset before launching the next. + +### Test Cases (run serially) + +**Test 1: Trigger/Safety** +- Sub-agent prompt: "Edit ONLY `manual_tests/test_trigger_safety_mode/feature.py` to add a comment. Do NOT edit the `_doc.md` file." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected: Hook fires with prompt about updating documentation + +**Test 2: Set Mode** +- Sub-agent prompt: "Edit ONLY `manual_tests/test_set_mode/module_source.py` to add a comment. Do NOT edit the `_test.py` file." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected: Hook fires with prompt about updating tests + +**Test 3: Pair Mode** +- Sub-agent prompt: "Edit ONLY `manual_tests/test_pair_mode/handler_trigger.py` to add a comment. Do NOT edit the `_expected.md` file." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected: Hook fires with prompt about updating expected output + +**Test 4: Command Action** +- Sub-agent prompt: "Edit `manual_tests/test_command_action/input.txt` to add some text." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected: Command runs automatically, appending to the log file (this rule always runs, no safety condition) + +**Test 5: Multi Safety** +- Sub-agent prompt: "Edit ONLY `manual_tests/test_multi_safety/core.py` to add a comment. Do NOT edit any of the safety files (`_safety_a.md`, `_safety_b.md`, or `_safety_c.md`)." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected: Hook fires with prompt about updating safety documentation + +**Test 6: Created Mode** +- Sub-agent prompt: "Create a NEW file `manual_tests/test_created_mode/new_config.yml` with some YAML content. This must be a NEW file, not a modification." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected: Hook fires with prompt about new configuration files + +### Results Tracking + +Record the result after each test: + +| Test Case | Should Fire | Visible Block? | Queue Entry? 
| Result | +|-----------|-------------|:--------------:|:------------:|:------:| +| Trigger/Safety | Edit .py only | | | | +| Set Mode | Edit _source.py only | | | | +| Pair Mode | Edit _trigger.py only | | | | +| Command Action | Edit .txt | | | | +| Multi Safety | Edit .py only | | | | +| Created Mode | Create NEW .yml | | | | + +**Queue Entry Status Guide:** +- If queue has entry with status "queued" -> Hook fired, rule was shown to agent +- If queue has entry with status "passed" -> Hook fired, rule was satisfied +- If queue is empty -> Hook did NOT fire + +## Quality Criteria + +- **Sub-agents spawned**: Tests were run using the Task tool to spawn sub-agents - the main agent did NOT edit files directly +- **Correct sub-agent config**: All sub-agents used `model: "haiku"` and `max_turns: 5` +- **Serial execution**: Sub-agents were launched ONE AT A TIME, not in parallel +- **Reset between tests**: Reset step was followed after each test +- **Hooks fired automatically**: The main agent observed the blocking hooks firing automatically when each sub-agent returned - the agent did NOT manually run rules_check +- **Early termination on 2 failures**: If 2 tests failed, testing halted immediately and results were reported +- **Results recorded**: Pass/fail status was recorded for each test case +- When all criteria are met, include `Quality Criteria Met` in your response + +## Reference + +See [test_reference.md](test_reference.md) for the complete test matrix and rule descriptions. + +## Context + +This step runs after the "should NOT fire" tests. These tests verify that rules correctly fire when trigger conditions are met without safety conditions. The serial execution with resets is essential to prevent cross-contamination between tests. Infinite block tests are handled in a separate step. + + +#### Outputs + +Create these files/directories: +- `fire_results` +#### Quality Validation + +Before completing this skill, verify: +1. **Sub-Agents Used**: Did the main agent spawn a sub-agent (using the Task tool) for EACH test? The main agent must NOT edit the test files directly. +2. **Sub-Agent Config**: Did all sub-agents use `model: "haiku"` and `max_turns: 5`? +3. **Serial Execution**: Were sub-agents launched ONE AT A TIME (not in parallel) to prevent cross-contamination? +4. **Hooks Fired Automatically**: Did the main agent observe the blocking hooks firing automatically when each sub-agent returned? The agent must NOT manually run the rules_check command. +5. **Reset Between Tests**: Was the reset step called internally after each test to revert files and prevent cross-contamination? +6. **Early Termination**: If 2 tests failed, did testing halt immediately with results reported? +7. **Results Recorded**: Did the main agent track pass/fail status for each test case? + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "run_fire_tests complete, outputs: fire_results" +3. **Continue to next skill**: Proceed to `infinite_block_tests` + +--- + +### Skill: infinite_block_tests + +**Type**: Workflow step 4/4 in **run_all** + +**Description**: Runs all 4 infinite block tests serially. Tests both 'should fire' (no promise) and 'should NOT fire' (with promise) scenarios. 
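Like the previous steps, this skill decides pass/fail by inspecting the rules queue. The entries referenced by the status guides in this document are JSON files under `.deepwork/tmp/rules/queue/`; a hypothetical example (only the `status` field is documented; the other field shown is invented for illustration):

```bash
cat .deepwork/tmp/rules/queue/*.json 2>/dev/null
# => {"rule": "example_rule", "status": "queued"}   # hypothetical entry
```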
+ +#### Prerequisites + +Before running this skill, ensure these are complete: +- `run_fire_tests` + + +#### Input Files + +Read these files (from previous steps): +- `fire_results` (from `run_fire_tests`) + +#### Instructions + +# Run Infinite Block Tests + +## Objective + +Run all infinite block tests in **serial** to verify that infinite blocking rules work correctly - both firing when they should AND not firing when bypassed with a promise tag. + +## CRITICAL: Sub-Agent Requirement + +**You MUST spawn sub-agents to make all file edits. DO NOT edit the test files yourself.** + +Why sub-agents are required: +1. Sub-agents run in isolated contexts where file changes are detected +2. When a sub-agent completes, the Stop hook **automatically** evaluates rules +3. You (the main agent) observe whether hooks fired - you do NOT manually trigger them +4. If you edit files directly, the hooks won't fire because you're not a completing sub-agent + +**NEVER manually run `echo '{}' | python -m deepwork.hooks.rules_check`** - this defeats the purpose of the test. Hooks must fire AUTOMATICALLY when sub-agents return. + +## CRITICAL: Serial Execution + +**These tests MUST run ONE AT A TIME, with resets between each.** + +Why serial execution is required for infinite block tests: +- Infinite block tests can block indefinitely without a promise tag +- Running them in parallel would cause unpredictable blocking behavior +- Serial execution allows controlled observation of each test + +## Task + +Run all 4 infinite block tests in **serial**, resetting between each, and verify correct blocking behavior. + +### Process + +For EACH test below, follow this cycle: + +1. **Launch a sub-agent** using the Task tool with: + - `model: "haiku"` - Use the fast model to minimize cost and latency + - `max_turns: 5` - **Critical safeguard**: Limits API round-trips to prevent infinite hanging. The Task tool does not support a direct timeout, so max_turns is our only protection against runaway sub-agents. +2. **Wait for the sub-agent to complete** +3. **Observe whether the hook fired automatically** - you should see a blocking prompt or command output +4. **If no visible blocking occurred, check the queue**: + ```bash + ls -la .deepwork/tmp/rules/queue/ + cat .deepwork/tmp/rules/queue/*.json 2>/dev/null + ``` + - If queue entries exist with status "queued", the hook DID fire but blocking wasn't visible + - If queue is empty, the hook did NOT fire at all + - Record the queue status along with the result +5. **Record the result** - see expected outcomes for each test +6. **Reset** (MANDATORY after each test) - follow the reset step instructions: + ```bash + git reset HEAD manual_tests/ && git checkout -- manual_tests/ && rm -f manual_tests/test_created_mode/new_config.yml + deepwork rules clear_queue + ``` +7. **Check for early termination**: If **2 tests have now failed**, immediately: + - Stop running any remaining tests + - Report the results summary showing which tests passed/failed + - The job halts here - do NOT proceed with remaining tests +8. **Proceed to the next test** (only if fewer than 2 failures) + +**IMPORTANT**: Only launch ONE sub-agent at a time. Wait for it to complete and reset before launching the next. + +### Test Cases (run serially) + +**Test 1: Infinite Block Prompt - Should NOT Fire (with promise)** +- Sub-agent prompt: "Edit `manual_tests/test_infinite_block_prompt/dangerous.py` to add a comment. Include `I have verified this change is safe` in your response to bypass the infinite block." 
+- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected: Sub-agent completes WITHOUT blocking - the promise tag bypasses the infinite block +- Result: PASS if no blocking, FAIL if blocked + +**Test 2: Infinite Block Command - Should NOT Fire (with promise)** +- Sub-agent prompt: "Edit `manual_tests/test_infinite_block_command/risky.py` to add a comment. Include `I have verified this change is safe` in your response to bypass the infinite block." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected: Sub-agent completes WITHOUT blocking - the promise tag bypasses the infinite block +- Result: PASS if no blocking, FAIL if blocked + +**Test 3: Infinite Block Prompt - Should Fire (no promise)** +- Sub-agent prompt: "Edit `manual_tests/test_infinite_block_prompt/dangerous.py` to add a comment. Do NOT include any promise tags." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected behavior: + 1. **Should fire**: Hook fires and shows blocking prompt + 2. **Should return in reasonable time**: Sub-agent hits max_turns limit and returns (not stuck forever) +- Result criteria: + - PASS if: Hook fired (visible block OR queue entry) AND sub-agent returned within reasonable time + - FAIL if: Hook did not fire, OR sub-agent hung indefinitely + +**Test 4: Infinite Block Command - Should Fire (no promise)** +- Sub-agent prompt: "Edit `manual_tests/test_infinite_block_command/risky.py` to add a comment. Do NOT include any promise tags." +- Sub-agent config: `model: "haiku"`, `max_turns: 5` +- Expected behavior: + 1. **Should fire**: Hook fires and command fails (exit code 1) + 2. **Should return in reasonable time**: Sub-agent hits max_turns limit and returns (not stuck forever) +- Result criteria: + - PASS if: Hook fired (visible block OR queue entry) AND sub-agent returned within reasonable time + - FAIL if: Hook did not fire, OR sub-agent hung indefinitely + +### Results Tracking + +Record the result after each test: + +| Test Case | Scenario | Should Fire? | Returned in Time? | Visible Block? | Queue Entry? 
| Result | +|-----------|----------|:------------:|:-----------------:|:--------------:|:------------:|:------:| +| Infinite Block Prompt | With promise | No | Yes | | | | +| Infinite Block Command | With promise | No | Yes | | | | +| Infinite Block Prompt | No promise | Yes | Yes | | | | +| Infinite Block Command | No promise | Yes | Yes | | | | + +**Result criteria:** +- **"Should NOT fire" tests (with promise)**: PASS if no blocking AND no queue entry AND returned quickly +- **"Should fire" tests (no promise)**: PASS if hook fired (visible block OR queue entry) AND returned in reasonable time (max_turns limit) + +**Queue Entry Status Guide:** +- If queue has entry with status "queued" -> Hook fired, rule was shown to agent +- If queue has entry with status "passed" -> Hook fired, rule was satisfied +- If queue is empty -> Hook did NOT fire + +## Quality Criteria + +- **Sub-agents spawned**: Tests were run using the Task tool to spawn sub-agents - the main agent did NOT edit files directly +- **Correct sub-agent config**: All sub-agents used `model: "haiku"` and `max_turns: 5` +- **Serial execution**: Sub-agents were launched ONE AT A TIME, not in parallel +- **Reset between tests**: Reset step was followed after each test +- **Hooks observed (not triggered)**: The main agent observed hook behavior without manually running rules_check - hooks fired AUTOMATICALLY +- **"Should NOT fire" tests verified**: Promise tests completed without blocking and no queue entries +- **"Should fire" tests verified**: Non-promise tests fired (visible block OR queue entry) AND returned in reasonable time (not hung indefinitely) +- **Early termination on 2 failures**: If 2 tests failed, testing halted immediately and results were reported +- **Results recorded**: Pass/fail status was recorded for each test run +- When all criteria are met, include `Quality Criteria Met` in your response + +## Reference + +See [test_reference.md](test_reference.md) for the complete test matrix and rule descriptions. + +## Context + +This step runs after both the "should NOT fire" and "should fire" test steps. It specifically tests infinite blocking behavior which requires serial execution due to the blocking nature of these rules. + + +#### Outputs + +Create these files/directories: +- `infinite_block_results` +#### Quality Validation + +Before completing this skill, verify: +1. **Sub-Agents Used**: Each test run via Task tool with `model: "haiku"` and `max_turns: 5` +2. **Serial Execution**: Sub-agents launched ONE AT A TIME with reset between each +3. **Promise Tests**: Completed WITHOUT blocking (promise bypassed the rule) +4. **No-Promise Tests**: Hook fired AND sub-agent returned in reasonable time (not hung) + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. + +#### On Completion + +1. Verify outputs are created +2. Inform user: "run_all workflow complete, outputs: infinite_block_results" +3. 
Consider creating a PR to merge the work branch + +--- ## Guardrails -- Do NOT copy/paste step instructions directly; always use the Skill tool to invoke steps -- Do NOT skip steps in a workflow unless the user explicitly requests it -- Do NOT proceed to the next step if the current step's outputs are incomplete -- Do NOT make assumptions about user intent; ask for clarification when ambiguous +- **Never skip prerequisites**: Always verify required steps are complete before running a skill +- **Never produce partial outputs**: Complete all required outputs before marking a skill done +- **Always use the work branch**: Never commit directly to main/master +- **Follow quality criteria**: Use sub-agent review when quality criteria are specified +- **Ask for clarification**: If user intent is unclear, ask before proceeding ## Context Files -- Job definition: `.deepwork/jobs/manual_tests/job.yml` \ No newline at end of file +- Job definition: `.deepwork/jobs/manual_tests/job.yml` +- reset instructions: `.deepwork/jobs/manual_tests/steps/reset.md` +- run_not_fire_tests instructions: `.deepwork/jobs/manual_tests/steps/run_not_fire_tests.md` +- run_fire_tests instructions: `.deepwork/jobs/manual_tests/steps/run_fire_tests.md` +- infinite_block_tests instructions: `.deepwork/jobs/manual_tests/steps/infinite_block_tests.md` diff --git a/.claude/skills/update.job/SKILL.md b/.claude/skills/update.job/SKILL.md deleted file mode 100644 index 44e92fe9..00000000 --- a/.claude/skills/update.job/SKILL.md +++ /dev/null @@ -1,145 +0,0 @@ ---- -name: update.job -description: "Edits standard job source files in src/ and runs deepwork install to sync changes. Use when updating job.yml or step instructions." -user-invocable: false - ---- - -# update.job - -**Standalone skill** - can be run anytime - -> Updates DeepWork standard jobs in src/ and syncs to installed locations. Use when modifying deepwork_jobs or deepwork_rules. - - -## Instructions - -**Goal**: Edits standard job source files in src/ and runs deepwork install to sync changes. Use when updating job.yml or step instructions. - -# Update Standard Job - -## Objective - -Edit standard job source files in `src/deepwork/standard_jobs/` and sync changes to installed locations. - -## Task - -When modifying a standard job in the DeepWork repository, this step ensures changes are made in the correct location and properly propagated. - -### Important: Source of Truth - -Standard jobs exist in THREE locations, but only ONE is the source of truth: - -| Location | Purpose | Editable? | -|----------|---------|-----------| -| `src/deepwork/standard_jobs/[job]/` | **Source of truth** | **YES** | -| `.deepwork/jobs/[job]/` | Installed copy | NO - overwritten by install | -| `.claude/commands/[job].[step].md` | Generated commands | NO - regenerated by sync | - -**NEVER edit files in `.deepwork/jobs/` or `.claude/commands/` for standard jobs!** - -### Process - -#### 1. Identify the Standard Job to Update - -From conversation context, determine: -- Which standard job needs updating (e.g., `deepwork_jobs`, `deepwork_rules`) -- What changes are needed (job.yml, step instructions, hooks, etc.) - -Current standard jobs: -```bash -ls src/deepwork/standard_jobs/ -``` - -#### 2. Make Changes in Source Location - -``` -src/deepwork/standard_jobs/[job_name]/ -├── job.yml # Job definition -├── steps/ # Step instruction files -├── hooks/ # Hook scripts -└── templates/ # Templates -``` - -#### 3. 
Run DeepWork Install - -```bash -deepwork install --platform claude -``` - -For Gemini: `deepwork install --platform gemini` - -#### 4. Verify the Sync - -```bash -# Verify job.yml -diff src/deepwork/standard_jobs/[job_name]/job.yml .deepwork/jobs/[job_name]/job.yml - -# Verify step files -diff -r src/deepwork/standard_jobs/[job_name]/steps/ .deepwork/jobs/[job_name]/steps/ - -# Check commands regenerated -ls -la .claude/commands/[job_name].*.md -``` - -## Quality Criteria - -- Changes made ONLY in `src/deepwork/standard_jobs/[job_name]/` -- `deepwork install --platform claude` executed successfully -- Files in `.deepwork/jobs/` match source -- Command files regenerated -- When all criteria are met, include `✓ Quality Criteria Met` - - -### Job Context - -A workflow for maintaining standard jobs bundled with DeepWork. Standard jobs -(like `deepwork_jobs` and `deepwork_rules`) are source-controlled in -`src/deepwork/standard_jobs/` and must be edited there—never in `.deepwork/jobs/` -or `.claude/commands/` directly. - -This job guides you through: -1. Identifying which standard job(s) to update from conversation context -2. Making changes in the correct source location (`src/deepwork/standard_jobs/[job_name]/`) -3. Running `deepwork install` to propagate changes to `.deepwork/` and command directories -4. Verifying the sync completed successfully - -Use this job whenever you need to modify job.yml files, step instructions, or hooks -for any standard job in the DeepWork repository. - - -## Required Inputs - -**User Parameters** - Gather from user before starting: -- **job_context**: Determine from conversation context which standard job(s) to update and what changes are needed - - -## Work Branch - -Use branch format: `deepwork/update-[instance]-YYYYMMDD` - -- If on a matching work branch: continue using it -- If on main/master: create new branch with `git checkout -b deepwork/update-[instance]-$(date +%Y%m%d)` - -## Outputs - -**Required outputs**: -- `files_synced` - -## Guardrails - -- Do NOT skip prerequisite verification if this step has dependencies -- Do NOT produce partial outputs; complete all required outputs before finishing -- Do NOT proceed without required inputs; ask the user if any are missing -- Do NOT modify files outside the scope of this step's defined outputs - -## On Completion - -1. Verify outputs are created -2. Inform user: "job complete, outputs: files_synced" - -This standalone skill can be re-run anytime. - ---- - -**Reference files**: `.deepwork/jobs/update/job.yml`, `.deepwork/jobs/update/steps/job.md` \ No newline at end of file diff --git a/.claude/skills/update/SKILL.md b/.claude/skills/update/SKILL.md index e63bd476..5342f646 100644 --- a/.claude/skills/update/SKILL.md +++ b/.claude/skills/update/SKILL.md @@ -3,12 +3,10 @@ name: update description: "Updates DeepWork standard jobs in src/ and syncs to installed locations. Use when modifying deepwork_jobs or deepwork_rules." --- -# update +# update Agent Updates DeepWork standard jobs in src/ and syncs to installed locations. Use when modifying deepwork_jobs or deepwork_rules. -> **CRITICAL**: Always invoke steps using the Skill tool. Never copy/paste step instructions directly. - A workflow for maintaining standard jobs bundled with DeepWork. 
Standard jobs (like `deepwork_jobs` and `deepwork_rules`) are source-controlled in `src/deepwork/standard_jobs/` and must be edited there—never in `.deepwork/jobs/` @@ -24,50 +22,167 @@ Use this job whenever you need to modify job.yml files, step instructions, or ho for any standard job in the DeepWork repository. -## Standalone Skills +## Agent Overview + +This agent handles the **update** job with 1 skill. + +**Standalone Skills**: job +--- + +## How to Use This Agent + +### Standalone Skills (run anytime) +- **job**: Edits standard job source files in src/ and runs deepwork install to sync changes. Use when updating job.yml or step instructions. + +### All Skills +- `job` - Edits standard job source files in src/ and runs deepwork install to sync changes. Use when updating job.yml or step instructions. + +--- + +## Agent Execution Instructions + +When invoked, follow these steps: + +### Step 1: Understand Intent + +Parse the user's request to determine: +1. Which workflow or skill to execute +2. Any parameters or context provided +3. Whether this is a continuation of previous work + +### Step 2: Check Work Branch + +Before executing any skill: +1. Check current git branch +2. If on a `deepwork/update-*` branch: continue using it +3. If on main/master: create new branch `deepwork/update-[instance]-$(date +%Y%m%d)` + +### Step 3: Execute the Appropriate Skill + +Navigate to the relevant skill section below and follow its instructions. + +### Step 4: Workflow Continuation + +After completing a workflow step: +1. Inform the user of completion and outputs created +2. Automatically proceed to the next step if one exists +3. Continue until the workflow is complete or the user intervenes + +--- + +## Skills + +### Skill: job + +**Type**: Standalone (can be run anytime) + +**Description**: Edits standard job source files in src/ and runs deepwork install to sync changes. Use when updating job.yml or step instructions. + -These skills can be run independently at any time: +#### Required User Input -- **job** - Edits standard job source files in src/ and runs deepwork install to sync changes. Use when updating job.yml or step instructions. - Command: `/update.job` +Gather these from the user before starting: +- **job_context**: Determine from conversation context which standard job(s) to update and what changes are needed -## Execution Instructions +#### Instructions -### Step 1: Analyze Intent +# Update Standard Job -Parse any text following `/update` to determine user intent: -- "job" or related terms → run standalone skill `update.job` +## Objective -### Step 2: Invoke Starting Step +Edit standard job source files in `src/deepwork/standard_jobs/` and sync changes to installed locations. + +## Task + +When modifying a standard job in the DeepWork repository, this step ensures changes are made in the correct location and properly propagated. + +### Important: Source of Truth + +Standard jobs exist in THREE locations, but only ONE is the source of truth: + +| Location | Purpose | Editable? | +|----------|---------|-----------| +| `src/deepwork/standard_jobs/[job]/` | **Source of truth** | **YES** | +| `.deepwork/jobs/[job]/` | Installed copy | NO - overwritten by install | +| `.claude/commands/[job].[step].md` | Generated commands | NO - regenerated by sync | + +**NEVER edit files in `.deepwork/jobs/` or `.claude/commands/` for standard jobs!** + +### Process + +#### 1. 
Identify the Standard Job to Update + +From conversation context, determine: +- Which standard job needs updating (e.g., `deepwork_jobs`, `deepwork_rules`) +- What changes are needed (job.yml, step instructions, hooks, etc.) + +Current standard jobs: +```bash +ls src/deepwork/standard_jobs/ +``` + +#### 2. Make Changes in Source Location -Use the Skill tool to invoke the identified starting step: ``` -Skill tool: update.job +src/deepwork/standard_jobs/[job_name]/ +├── job.yml # Job definition +├── steps/ # Step instruction files +├── hooks/ # Hook scripts +└── templates/ # Templates ``` -### Step 3: Continue Workflow Automatically +#### 3. Run DeepWork Install -After each step completes: -1. Check if there's a next step in the workflow sequence -2. Invoke the next step using the Skill tool -3. Repeat until workflow is complete or user intervenes +```bash +deepwork install --platform claude +``` -**Note**: Standalone skills do not auto-continue to other steps. +For Gemini: `deepwork install --platform gemini` -### Handling Ambiguous Intent +#### 4. Verify the Sync -If user intent is unclear, use AskUserQuestion to clarify: -- Present available steps as numbered options -- Let user select the starting point +```bash +# Verify job.yml +diff src/deepwork/standard_jobs/[job_name]/job.yml .deepwork/jobs/[job_name]/job.yml + +# Verify step files +diff -r src/deepwork/standard_jobs/[job_name]/steps/ .deepwork/jobs/[job_name]/steps/ + +# Check commands regenerated +ls -la .claude/commands/[job_name].*.md +``` + +## Quality Criteria + +- Changes made ONLY in `src/deepwork/standard_jobs/[job_name]/` +- `deepwork install --platform claude` executed successfully +- Files in `.deepwork/jobs/` match source +- Command files regenerated +- When all criteria are met, include `✓ Quality Criteria Met` + + +#### Outputs + +Create these files/directories: +- `files_synced` + +#### On Completion + +1. Verify outputs are created +2. 
Inform user: "job complete, outputs: files_synced" + +--- ## Guardrails -- Do NOT copy/paste step instructions directly; always use the Skill tool to invoke steps -- Do NOT skip steps in a workflow unless the user explicitly requests it -- Do NOT proceed to the next step if the current step's outputs are incomplete -- Do NOT make assumptions about user intent; ask for clarification when ambiguous +- **Never skip prerequisites**: Always verify required steps are complete before running a skill +- **Never produce partial outputs**: Complete all required outputs before marking a skill done +- **Always use the work branch**: Never commit directly to main/master +- **Follow quality criteria**: Use sub-agent review when quality criteria are specified +- **Ask for clarification**: If user intent is unclear, ask before proceeding ## Context Files -- Job definition: `.deepwork/jobs/update/job.yml` \ No newline at end of file +- Job definition: `.deepwork/jobs/update/job.yml` +- job instructions: `.deepwork/jobs/update/steps/job.md` diff --git a/CHANGELOG.md b/CHANGELOG.md index 9fbcd172..bcb9f806 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] ### Added +- Agent-based architecture for Claude Code skill generation + - New `agent-job.md.jinja` template generates single agent file per job with all steps embedded as skills + - Jobs now generate one agent file (e.g., `.claude/skills/job_name/SKILL.md`) instead of separate meta-skill + step skill files + - Added `supports_agent_mode` flag to adapters (enabled for Claude, disabled for Gemini) + - Added `generate_agent()` and `generate_all()` methods to SkillGenerator + - Sync command now cleans up old step skill directories when using agent mode - Explicit workflow definitions in job.yml for distinguishing multi-step workflows from standalone skills - New `workflows` section in job.yml with `name`, `summary`, and ordered `steps` array - Workflows are shown separately from standalone skills in generated meta-skills @@ -16,6 +22,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Backward compatible: jobs without `workflows` section use dependency-based detection ### Changed +- **Claude Code skill structure changed**: Jobs now generate a single agent file instead of separate meta-skill + step skill files + - Before: `.claude/skills/job_name/SKILL.md` (meta) + `.claude/skills/job_name.step_id/SKILL.md` (steps) + - After: `.claude/skills/job_name/SKILL.md` (agent with all steps embedded) + - Gemini CLI continues to use the legacy multi-file structure - Skill templates now show workflow-aware progress (e.g., "new_job step 2/3 complete") - Meta-skill template reorganized to show "Workflows" and "Standalone Skills" sections separately - Updated `deepwork_jobs` standard job to v1.0.0 with explicit `new_job` workflow diff --git a/doc/architecture.md b/doc/architecture.md index cafd92c3..679b8a5d 100644 --- a/doc/architecture.md +++ b/doc/architecture.md @@ -60,7 +60,8 @@ deepwork/ # DeepWork tool repository │ │ └── rules_check.py # Cross-platform rule evaluation hook │ ├── templates/ # Skill templates for each platform │ │ ├── claude/ -│ │ │ └── skill-job-step.md.jinja +│ │ │ ├── agent-job.md.jinja # Agent template (single file per job) +│ │ │ └── skill-job-step.md.jinja # Legacy step skill template │ │ ├── gemini/ │ │ └── copilot/ │ ├── standard_jobs/ # Built-in job definitions @@ -196,6 +197,9 @@ class 
ClaudeAdapter(AgentAdapter): SkillLifecycleHook.BEFORE_TOOL: "PreToolUse", SkillLifecycleHook.BEFORE_PROMPT: "UserPromptSubmit", } + + # Claude Code supports agent mode - jobs generate single agent files + supports_agent_mode = True ``` ### 3. Platform Detector (`detector.py`) @@ -226,50 +230,50 @@ class PlatformDetector: Generates AI-platform-specific skill files from job definitions. -This component is called by the `sync` command to regenerate all skills: -1. Reads the job definition from `.deepwork/jobs/[job-name]/job.yml` -2. Loads platform-specific templates -3. Generates skill files for each step in the job -4. Writes skills to the AI platform's skills directory +This component is called by the `sync` command to regenerate all skills. + +**Two Generation Modes**: + +1. **Agent Mode** (Claude Code): Generates a single agent file per job containing all steps as embedded skills. The agent file can be invoked as a slash command and handles workflow orchestration internally. + +2. **Legacy Mode** (Gemini, others): Generates a meta-skill file plus separate step skill files that invoke each other via the Skill tool. **Example Generation Flow**: ```python class SkillGenerator: - def generate_all_skills(self, job: JobDefinition, - platform: PlatformConfig, - output_dir: Path) -> list[Path]: - """Generate skill files for all steps in a job.""" - skill_paths = [] - - for step_index, step in enumerate(job.steps): - # Load step instructions - instructions = read_file(job.job_dir / step.instructions_file) - - # Build template context - context = { - "job_name": job.name, - "step_id": step.id, - "step_name": step.name, - "step_number": step_index + 1, - "total_steps": len(job.steps), - "instructions_content": instructions, - "user_inputs": [inp for inp in step.inputs if inp.is_user_input()], - "file_inputs": [inp for inp in step.inputs if inp.is_file_input()], - "outputs": step.outputs, - "dependencies": step.dependencies, - "exposed": step.exposed, - } - - # Render template - template = env.get_template("skill-job-step.md.jinja") - rendered = template.render(**context) - - # Write to platform's skills directory - skill_path = output_dir / platform.config_dir / platform.skills_dir / f"{job.name}.{step.id}.md" - write_file(skill_path, rendered) - skill_paths.append(skill_path) - - return skill_paths + def generate_all(self, job: JobDefinition, + adapter: AgentAdapter, + output_dir: Path, + use_agent_mode: bool = True) -> list[Path]: + """Generate skill files for a job. + + Args: + use_agent_mode: If True (default for Claude), generate single + agent file. If False, use legacy meta-skill + + step skills pattern. 
+ """ + if use_agent_mode: + # Agent mode: single file containing all skills + return [self.generate_agent(job, adapter, output_dir)] + else: + # Legacy mode: meta-skill + separate step skills + return self.generate_all_skills(job, adapter, output_dir) + + def generate_agent(self, job: JobDefinition, + adapter: AgentAdapter, + output_dir: Path) -> Path: + """Generate a single agent file with all steps embedded.""" + # Build context with full step details + context = self._build_agent_context(job, adapter) + + # Render agent template + template = env.get_template("agent-job.md.jinja") + rendered = template.render(**context) + + # Write to skills directory (e.g., .claude/skills/job_name/SKILL.md) + skill_path = output_dir / adapter.skills_dir / job.name / "SKILL.md" + write_file(skill_path, rendered) + return skill_path ``` --- @@ -285,13 +289,13 @@ my-project/ # User's project (target) ├── .git/ ├── .claude/ # Claude Code directory │ ├── settings.json # Includes installed hooks -│ └── skills/ # Skill files -│ ├── deepwork_jobs.define.md # Core DeepWork skills -│ ├── deepwork_jobs.implement.md -│ ├── deepwork_jobs.refine.md -│ ├── deepwork_rules.define.md # Rule management -│ ├── competitive_research.identify_competitors.md -│ └── ... +│ └── skills/ # Skill files (agent mode: one dir per job) +│ ├── deepwork_jobs/ # Job as agent with embedded skills +│ │ └── SKILL.md +│ ├── deepwork_rules/ +│ │ └── SKILL.md +│ └── competitive_research/ +│ └── SKILL.md # Contains all steps as embedded skills ├── .deepwork/ # DeepWork configuration │ ├── config.yml # Platform config │ ├── .gitignore # Ignores tmp/ directory @@ -547,11 +551,13 @@ Create `competitors.md` with this structure: - [ ] No duplicate entries ``` -## Generated Command Files +## Generated Skill Files + +When the job is defined and `sync` is run, DeepWork generates skill files. -When the job is defined and `sync` is run, DeepWork generates command files. Example for Claude Code: +**Claude Code (Agent Mode)**: Generates a single agent file per job at `.claude/skills/[job_name]/SKILL.md` containing all steps as embedded skills. -`.deepwork/jobs/competitive_research` a step called `identify_competitors` will generate a skill file at `.claude/skills/competitive_research.identify_competitors.md`: +**Gemini CLI (Legacy Mode)**: Generates separate files - a meta-skill at `.gemini/skills/[job_name]/index.toml` plus step skills at `.gemini/skills/[job_name]/[step_id].toml`. # Part 3: Runtime Execution Model diff --git a/src/deepwork/cli/sync.py b/src/deepwork/cli/sync.py index 00588dcd..28fcb99d 100644 --- a/src/deepwork/cli/sync.py +++ b/src/deepwork/cli/sync.py @@ -1,5 +1,6 @@ """Sync command for DeepWork CLI.""" +import shutil from pathlib import Path import click @@ -9,7 +10,7 @@ from deepwork.core.adapters import AgentAdapter from deepwork.core.generator import SkillGenerator from deepwork.core.hooks_syncer import collect_job_hooks, sync_hooks_to_platform -from deepwork.core.parser import parse_job_definition +from deepwork.core.parser import JobDefinition, parse_job_definition from deepwork.utils.fs import ensure_dir from deepwork.utils.yaml_utils import load_yaml @@ -22,6 +23,39 @@ class SyncError(Exception): pass +def _cleanup_old_step_skills( + skills_dir: Path, job: JobDefinition, adapter: AgentAdapter +) -> int: + """ + Remove old step skill directories when switching to agent mode. + + In agent mode, we only generate a single agent file per job (job_name/SKILL.md). 
+ This function removes any legacy step skill directories (job_name.step_id/). + + Args: + skills_dir: Path to the skills directory + job: Job definition + adapter: Agent adapter + + Returns: + Number of directories removed + """ + removed = 0 + for step in job.steps: + # Get the old step skill path (e.g., job_name.step_id/SKILL.md) + step_skill_filename = adapter.get_step_skill_filename(job.name, step.id, step.exposed) + step_skill_dir = skills_dir / step_skill_filename.split("/")[0] # Get directory part + + if step_skill_dir.exists() and step_skill_dir.is_dir(): + try: + shutil.rmtree(step_skill_dir) + removed += 1 + except OSError: + pass # Ignore errors during cleanup + + return removed + + @click.command() @click.option( "--path", @@ -116,7 +150,7 @@ def sync_skills(project_path: Path) -> None: # Sync each platform generator = SkillGenerator() - stats = {"platforms": 0, "skills": 0, "hooks": 0} + stats = {"platforms": 0, "skills": 0, "agents": 0, "hooks": 0} synced_adapters: list[AgentAdapter] = [] for platform_name in platforms: @@ -136,17 +170,31 @@ def sync_skills(project_path: Path) -> None: ensure_dir(skills_dir) # Generate skills for all jobs + # Use agent mode if the adapter supports it (generates single agent file per job) + # Otherwise fall back to legacy meta-skill + step skills pattern + use_agent_mode = adapter.supports_agent_mode all_skill_paths: list[Path] = [] if jobs: - console.print(" [dim]•[/dim] Generating skills...") + # Clean up old step skill files when using agent mode + if use_agent_mode: + for job in jobs: + _cleanup_old_step_skills(skills_dir, job, adapter) + + mode_label = "agents" if use_agent_mode else "skills" + console.print(f" [dim]•[/dim] Generating {mode_label}...") for job in jobs: try: - job_paths = generator.generate_all_skills( - job, adapter, platform_dir, project_root=project_path + job_paths = generator.generate_all( + job, adapter, platform_dir, project_root=project_path, + use_agent_mode=use_agent_mode ) all_skill_paths.extend(job_paths) - stats["skills"] += len(job_paths) - console.print(f" [green]✓[/green] {job.name} ({len(job_paths)} skills)") + if use_agent_mode: + stats["agents"] += 1 + console.print(f" [green]✓[/green] {job.name} (1 agent)") + else: + stats["skills"] += len(job_paths) + console.print(f" [green]✓[/green] {job.name} ({len(job_paths)} skills)") except Exception as e: console.print(f" [red]✗[/red] Failed for {job.name}: {e}") @@ -196,7 +244,10 @@ def sync_skills(project_path: Path) -> None: table.add_column("Count", style="green") table.add_row("Platforms synced", str(stats["platforms"])) - table.add_row("Total skills", str(stats["skills"])) + if stats["agents"] > 0: + table.add_row("Total agents", str(stats["agents"])) + if stats["skills"] > 0: + table.add_row("Total skills", str(stats["skills"])) if stats["hooks"] > 0: table.add_row("Hooks synced", str(stats["hooks"])) @@ -204,8 +255,9 @@ def sync_skills(project_path: Path) -> None: console.print() # Show reload instructions for each synced platform - if synced_adapters and stats["skills"] > 0: - console.print("[bold]To use the new skills:[/bold]") + if synced_adapters and (stats["skills"] > 0 or stats["agents"] > 0): + label = "agents" if stats["agents"] > 0 else "skills" + console.print(f"[bold]To use the new {label}:[/bold]") for adapter in synced_adapters: console.print(f" [cyan]{adapter.display_name}:[/cyan] {adapter.reload_instructions}") console.print() diff --git a/src/deepwork/core/adapters.py b/src/deepwork/core/adapters.py index ed6841a2..bd96302a 100644 --- 
a/src/deepwork/core/adapters.py +++ b/src/deepwork/core/adapters.py @@ -66,6 +66,10 @@ class AgentAdapter(ABC): # Subclasses should override this to provide platform-specific mappings. hook_name_mapping: ClassVar[dict[SkillLifecycleHook, str]] = {} + # Whether this adapter supports agent mode (single agent file with embedded skills). + # If False, falls back to legacy meta-skill + step skills pattern. + supports_agent_mode: ClassVar[bool] = False + def __init__(self, project_root: Path | str | None = None): """ Initialize adapter with optional project root. @@ -309,6 +313,9 @@ class ClaudeAdapter(AgentAdapter): SkillLifecycleHook.BEFORE_PROMPT: "UserPromptSubmit", } + # Claude Code supports agent mode - jobs become agents with embedded skills + supports_agent_mode: ClassVar[bool] = True + def sync_hooks(self, project_path: Path, hooks: dict[str, list[dict[str, Any]]]) -> int: """ Sync hooks to Claude Code settings.json. diff --git a/src/deepwork/core/generator.py b/src/deepwork/core/generator.py index 859b7070..0a29b80c 100644 --- a/src/deepwork/core/generator.py +++ b/src/deepwork/core/generator.py @@ -23,7 +23,17 @@ class GeneratorError(Exception): class SkillGenerator: - """Generates skill files from job definitions.""" + """Generates skill files from job definitions. + + Supports two generation modes: + 1. Agent mode (default): Generates a single agent file containing the job + with all steps as embedded skills. The agent can be spawned via Task tool. + 2. Legacy mode: Generates separate meta-skill and step skill files that use + the Skill tool to invoke each other. + """ + + # Template names for agent generation + AGENT_TEMPLATE = "agent-job.md.jinja" def __init__(self, templates_dir: Path | str | None = None): """ @@ -550,3 +560,182 @@ def generate_all_skills( skill_paths.append(skill_path) return skill_paths + + def _build_agent_context( + self, + job: JobDefinition, + adapter: AgentAdapter, + project_root: Path | None = None, + ) -> dict[str, Any]: + """ + Build template context for an agent file. + + The agent context includes full step details with instructions embedded, + unlike the meta-skill context which only includes step metadata. 
+ + Args: + job: Job definition + adapter: Agent adapter for platform-specific configuration + project_root: Optional project root for loading doc specs + + Returns: + Template context dictionary with full step details + """ + # Build full step info with instructions for the agent + steps_info = [] + for step_index, step in enumerate(job.steps): + # Build full step context + step_context = self._build_step_context( + job, step, step_index, adapter, project_root + ) + + # Get workflow info for step + workflow = job.get_workflow_for_step(step.id) + + step_info = { + "id": step.id, + "name": step.name, + "description": step.description, + "instructions_file": step.instructions_file, + "instructions_content": step_context["instructions_content"], + "user_inputs": step_context["user_inputs"], + "file_inputs": step_context["file_inputs"], + "outputs": step_context["outputs"], + "dependencies": step.dependencies, + "exposed": step.exposed, + "is_standalone": self._is_standalone_step(job, step), + "quality_criteria": step.quality_criteria, + "next_step": step_context.get("next_step"), + "prev_step": step_context.get("prev_step"), + } + + # Add workflow context if applicable + if workflow: + step_info["workflow_name"] = workflow.name + step_info["workflow_step_number"] = step_context.get("workflow_step_number") + step_info["workflow_total_steps"] = step_context.get("workflow_total_steps") + + steps_info.append(step_info) + + # Build workflow info + workflows_info = [] + for workflow in job.workflows: + workflows_info.append( + { + "name": workflow.name, + "summary": workflow.summary, + "steps": workflow.steps, + "first_step": workflow.steps[0] if workflow.steps else None, + } + ) + + # Identify standalone steps (not in any workflow) + standalone_steps = [s for s in steps_info if s["is_standalone"]] + + return { + "job_name": job.name, + "job_version": job.version, + "job_summary": job.summary, + "job_description": job.description, + "total_steps": len(job.steps), + "steps": steps_info, + "workflows": workflows_info, + "standalone_steps": standalone_steps, + "has_workflows": bool(job.workflows), + } + + def generate_agent( + self, + job: JobDefinition, + adapter: AgentAdapter, + output_dir: Path | str, + project_root: Path | str | None = None, + ) -> Path: + """ + Generate an agent file for a job. + + The agent file contains the job as an agent with all steps embedded + as skills. This is the new pattern replacing the meta-skill + step skills + approach. 
+ + Args: + job: Job definition + adapter: Agent adapter for the target platform + output_dir: Directory to write agent file to + project_root: Optional project root for loading doc specs + + Returns: + Path to generated agent file + + Raises: + GeneratorError: If generation fails + """ + output_dir = Path(output_dir) + project_root_path = Path(project_root) if project_root else output_dir + + # Create skills subdirectory if needed + skills_dir = output_dir / adapter.skills_dir + skills_dir.mkdir(parents=True, exist_ok=True) + + # Build context with full step details + context = self._build_agent_context(job, adapter, project_root_path) + + # Load and render agent template + env = self._get_jinja_env(adapter) + try: + template = env.get_template(self.AGENT_TEMPLATE) + except TemplateNotFound as e: + raise GeneratorError(f"Agent template not found: {e}") from e + + try: + rendered = template.render(**context) + except Exception as e: + raise GeneratorError(f"Agent template rendering failed: {e}") from e + + # Write agent file using meta-skill filename (job_name/SKILL.md) + # The agent replaces the meta-skill as the job's entry point + agent_filename = adapter.get_meta_skill_filename(job.name) + agent_path = skills_dir / agent_filename + + # Ensure parent directories exist + agent_path.parent.mkdir(parents=True, exist_ok=True) + + try: + safe_write(agent_path, rendered) + except Exception as e: + raise GeneratorError(f"Failed to write agent file: {e}") from e + + return agent_path + + def generate_all( + self, + job: JobDefinition, + adapter: AgentAdapter, + output_dir: Path | str, + project_root: Path | str | None = None, + use_agent_mode: bool = True, + ) -> list[Path]: + """ + Generate all files for a job. + + Args: + job: Job definition + adapter: Agent adapter for the target platform + output_dir: Directory to write files to + project_root: Optional project root for loading doc specs + use_agent_mode: If True (default), generate single agent file. + If False, use legacy meta-skill + step skills pattern. + + Returns: + List of paths to generated files + + Raises: + GeneratorError: If generation fails + """ + if use_agent_mode: + # Agent mode: single agent file containing all skills + agent_path = self.generate_agent(job, adapter, output_dir, project_root) + return [agent_path] + else: + # Legacy mode: meta-skill + separate step skills + return self.generate_all_skills(job, adapter, output_dir, project_root) diff --git a/src/deepwork/templates/claude/agent-job.md.jinja b/src/deepwork/templates/claude/agent-job.md.jinja new file mode 100644 index 00000000..5d5fadd5 --- /dev/null +++ b/src/deepwork/templates/claude/agent-job.md.jinja @@ -0,0 +1,243 @@ +{# +Template: agent-job.md.jinja +Purpose: Generates an agent definition file for Claude Code that contains the job as an agent + with all steps as embedded skills. This replaces the meta-skill pattern. + +The agent can be spawned via the Task tool or invoked as a slash command. 
+ +Template Variables: + - job_name: string - Job identifier (e.g., "competitive_research") + - job_summary: string - Short one-line summary of the job + - job_description: string|null - Full description (optional) + - total_steps: int - Number of steps in the job + - has_workflows: bool - True if workflows are defined + - workflows: list - Array of workflow objects: + - name: string - Workflow identifier + - summary: string - Short description of workflow + - steps: list[string] - Ordered list of step IDs + - first_step: string - First step ID to start workflow + - standalone_steps: list - Steps not in any workflow + - steps: list - Array of step objects with full context: + - id: string - Step identifier + - name: string - Human-readable step name + - description: string - What the step does + - instructions_content: string - Full instructions markdown + - user_inputs: list - User parameters to gather + - file_inputs: list - Files from previous steps + - outputs: list - Output specifications + - dependencies: list[string] - Required prior step IDs + - is_standalone: bool - True if not in any workflow + - workflow_name: string|null - Name of workflow if in one + - workflow_step_number: int|null - Position in workflow + - workflow_total_steps: int|null - Total steps in workflow + - next_step: string|null - Next step in workflow + - quality_criteria: list[string]|null - Criteria for completion +#} +{#- ==================== MACROS ==================== -#} + +{#- Macro: Render output file list as comma-separated string -#} +{% macro output_files(outputs) -%} +{{ outputs | map(attribute='file') | join(', ') }} +{%- endmacro %} + +{#- Macro: Render workflow as "name: summary (steps)" -#} +{% macro workflow_desc(workflow) -%} +**{{ workflow.name }}**: {{ workflow.summary }} ({{ workflow.steps | join(' → ') }}) +{%- endmacro %} + +{#- Macro: Render step completion message -#} +{% macro completion_msg(step, job_name) -%} +{% if step.is_standalone -%} +"{{ step.id }} complete{% if step.outputs %}, outputs: {{ output_files(step.outputs) }}{% endif %}" +{%- elif step.next_step -%} +"{{ step.id }} complete{% if step.outputs %}, outputs: {{ output_files(step.outputs) }}{% endif %}" +{%- else -%} +"{{ step.workflow_name if step.workflow_name else job_name }} workflow complete{% if step.outputs %}, outputs: {{ output_files(step.outputs) }}{% endif %}" +{%- endif %} +{%- endmacro %} + +{#- ==================== TEMPLATE ==================== -#} +--- +name: {{ job_name }} +description: "{{ job_summary }}" +--- + +# {{ job_name }} Agent + +{{ job_summary }} + +{% if job_description %} +{{ job_description }} +{% endif %} + +## Agent Overview + +This agent handles the **{{ job_name }}** job with {{ total_steps }} skill{{ 's' if total_steps != 1 else '' }}. 
+ +{% if has_workflows %} +**Workflows**: {% for workflow in workflows %}{{ workflow.name }}{% if not loop.last %}, {% endif %}{% endfor %} + +{% endif %} +{% if standalone_steps %} +**Standalone Skills**: {% for step in standalone_steps %}{{ step.id }}{% if not loop.last %}, {% endif %}{% endfor %} + +{% endif %} +--- + +## How to Use This Agent + +{% if has_workflows %} +### Workflows +{% for workflow in workflows %} +- {{ workflow_desc(workflow) }} + - Start: `{{ workflow.first_step }}` +{% endfor %} + +{% endif %} +{% if standalone_steps %} +### Standalone Skills (run anytime) +{% for step in standalone_steps %} +- **{{ step.id }}**: {{ step.description }} +{% endfor %} + +{% endif %} +### All Skills +{% for step in steps %} +- `{{ step.id }}` - {{ step.description }} +{% endfor %} + +--- + +## Agent Execution Instructions + +When invoked, follow these steps: + +### Step 1: Understand Intent + +Parse the user's request to determine: +1. Which workflow or skill to execute +2. Any parameters or context provided +3. Whether this is a continuation of previous work + +### Step 2: Check Work Branch + +Before executing any skill: +1. Check current git branch +2. If on a `deepwork/{{ job_name }}-*` branch: continue using it +3. If on main/master: create new branch `deepwork/{{ job_name }}-[instance]-$(date +%Y%m%d)` + +### Step 3: Execute the Appropriate Skill + +Navigate to the relevant skill section below and follow its instructions. + +### Step 4: Workflow Continuation + +After completing a workflow step: +1. Inform the user of completion and outputs created +2. Automatically proceed to the next step if one exists +3. Continue until the workflow is complete or the user intervenes + +--- + +## Skills +{% for step in steps %} + +### Skill: {{ step.id }} + +{% if step.is_standalone %} +**Type**: Standalone (can be run anytime) +{% elif step.workflow_name %} +**Type**: Workflow step {{ step.workflow_step_number }}/{{ step.workflow_total_steps }} in **{{ step.workflow_name }}** +{% endif %} + +**Description**: {{ step.description }} + +{% if step.dependencies %} +#### Prerequisites + +Before running this skill, ensure these are complete: +{% for dep in step.dependencies %} +- `{{ dep }}` +{% endfor %} +{% endif %} + +{% if step.user_inputs %} +#### Required User Input + +Gather these from the user before starting: +{% for input in step.user_inputs %} +- **{{ input.name }}**: {{ input.description }} +{% endfor %} +{% endif %} + +{% if step.file_inputs %} +#### Input Files + +Read these files (from previous steps): +{% for input in step.file_inputs %} +- `{{ input.file }}` (from `{{ input.from_step }}`) +{% endfor %} +{% endif %} + +#### Instructions + +{{ step.instructions_content }} + +{% if step.outputs %} +#### Outputs + +Create these files/directories: +{% for output in step.outputs %} +- `{{ output.file }}`{% if output.file.endswith('/') %} (directory){% endif %} +{% if output.has_doc_spec and output.doc_spec %} + + **Doc Spec**: {{ output.doc_spec.name }} + > {{ output.doc_spec.description }} +{% if output.doc_spec.quality_criteria %} + **Quality Criteria**: +{% for criterion in output.doc_spec.quality_criteria %} + - **{{ criterion.name }}**: {{ criterion.description }} +{% endfor %} +{% endif %} +{% endif %} +{% endfor %} +{% endif %} + +{% if step.quality_criteria %} +#### Quality Validation + +Before completing this skill, verify: +{% for criterion in step.quality_criteria %} +{{ loop.index }}. 
{{ criterion }} +{% endfor %} + +Use a sub-agent (Haiku model) to review your work against these criteria before marking complete. +{% endif %} + +#### On Completion + +1. Verify outputs are created +2. Inform user: {{ completion_msg(step, job_name) }} +{% if step.next_step %} +3. **Continue to next skill**: Proceed to `{{ step.next_step }}` +{% elif not step.is_standalone %} +3. Consider creating a PR to merge the work branch +{% endif %} + +--- +{% endfor %} + +## Guardrails + +- **Never skip prerequisites**: Always verify required steps are complete before running a skill +- **Never produce partial outputs**: Complete all required outputs before marking a skill done +- **Always use the work branch**: Never commit directly to main/master +- **Follow quality criteria**: Use sub-agent review when quality criteria are specified +- **Ask for clarification**: If user intent is unclear, ask before proceeding + +## Context Files + +- Job definition: `.deepwork/jobs/{{ job_name }}/job.yml` +{% for step in steps %} +- {{ step.id }} instructions: `.deepwork/jobs/{{ job_name }}/{{ step.instructions_file }}` +{% endfor %} diff --git a/tests/integration/test_install_flow.py b/tests/integration/test_install_flow.py index d0638275..80fc73ba 100644 --- a/tests/integration/test_install_flow.py +++ b/tests/integration/test_install_flow.py @@ -39,25 +39,23 @@ def test_install_with_claude(self, mock_claude_project: Path) -> None: assert config is not None assert "claude" in config["platforms"] - # Verify core skills were created (directory/SKILL.md format) + # Verify agent file was created (agent mode: single file per job) claude_dir = mock_claude_project / ".claude" / "skills" - # Meta-skill + # Agent file (contains all skills embedded) assert (claude_dir / "deepwork_jobs" / "SKILL.md").exists() - # Step skill (no prefix, but has user-invocable: false in frontmatter) - assert (claude_dir / "deepwork_jobs.define" / "SKILL.md").exists() - # Exposed step skill (user-invocable - learn has exposed: true) - assert (claude_dir / "deepwork_jobs.learn" / "SKILL.md").exists() - - # Verify meta-skill content - meta_skill = (claude_dir / "deepwork_jobs" / "SKILL.md").read_text() - assert "# deepwork_jobs" in meta_skill - # deepwork_jobs has workflows defined, so it shows "Workflows" instead of "Available Steps" - assert "Workflows" in meta_skill or "Available Steps" in meta_skill - - # Verify step skill content - define_skill = (claude_dir / "deepwork_jobs.define" / "SKILL.md").read_text() - assert "# deepwork_jobs.define" in define_skill - assert "Define Job Specification" in define_skill + # In agent mode, step skill directories should NOT exist + assert not (claude_dir / "deepwork_jobs.define" / "SKILL.md").exists() + assert not (claude_dir / "deepwork_jobs.learn" / "SKILL.md").exists() + + # Verify agent content includes all skills embedded + agent_content = (claude_dir / "deepwork_jobs" / "SKILL.md").read_text() + assert "# deepwork_jobs Agent" in agent_content + # deepwork_jobs has workflows defined + assert "Available Workflows" in agent_content or "Workflows" in agent_content + # Agent should have skills section with all steps embedded + assert "## Skills" in agent_content + assert "### Skill: define" in agent_content + assert "### Skill: learn" in agent_content def test_install_with_auto_detect(self, mock_claude_project: Path) -> None: """Test installing with auto-detection.""" @@ -134,11 +132,12 @@ def test_install_with_multiple_platforms_auto_detect( # Verify skills were created for both platforms 
claude_dir = mock_multi_platform_project / ".claude" / "skills" - # Meta-skill and step skills (directory/SKILL.md format) + # Claude uses agent mode: single agent file per job assert (claude_dir / "deepwork_jobs" / "SKILL.md").exists() - assert (claude_dir / "deepwork_jobs.define" / "SKILL.md").exists() + # In agent mode, step skill directories should NOT exist for Claude + assert not (claude_dir / "deepwork_jobs.define" / "SKILL.md").exists() - # Gemini uses job_name/step_id.toml structure + # Gemini uses legacy mode (doesn't support agent mode): job_name/step_id.toml structure gemini_dir = mock_multi_platform_project / ".gemini" / "skills" # Meta-skill (index.toml) and step skills assert (gemini_dir / "deepwork_jobs" / "index.toml").exists() @@ -181,10 +180,11 @@ def test_install_is_idempotent(self, mock_claude_project: Path) -> None: assert (deepwork_dir / "config.yml").exists() claude_dir = mock_claude_project / ".claude" / "skills" - # Meta-skill and step skills (directory/SKILL.md format) + # Agent file (agent mode: single file per job) assert (claude_dir / "deepwork_jobs" / "SKILL.md").exists() - assert (claude_dir / "deepwork_jobs.define" / "SKILL.md").exists() - assert (claude_dir / "deepwork_jobs.learn" / "SKILL.md").exists() + # In agent mode, step skill directories should NOT exist + assert not (claude_dir / "deepwork_jobs.define" / "SKILL.md").exists() + assert not (claude_dir / "deepwork_jobs.learn" / "SKILL.md").exists() def test_install_creates_rules_directory(self, mock_claude_project: Path) -> None: """Test that install creates the v2 rules directory with example templates."""
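Reviewer note: taken together, the `supports_agent_mode` flag, the `generate_all()` dispatch, and these test expectations pin down a simple contract for where generated files land. Below is a minimal sketch of that contract, assuming the layouts documented in this diff; the paths are illustrative, and the real logic lives in `SkillGenerator` and the adapters.

```python
# Sketch of the agent-mode vs. legacy-mode output contract described in this
# diff. Not the actual SkillGenerator implementation; paths follow the
# examples in doc/architecture.md and the integration tests.
from pathlib import Path


def expected_paths(
    job_name: str, step_ids: list[str], supports_agent_mode: bool
) -> list[Path]:
    if supports_agent_mode:
        # Agent mode (Claude): one file per job with all steps embedded
        return [Path(".claude/skills") / job_name / "SKILL.md"]
    # Legacy mode (Gemini): meta-skill plus one file per step
    base = Path(".gemini/skills") / job_name
    return [base / "index.toml"] + [base / f"{step_id}.toml" for step_id in step_ids]


# Mirrors the assertions in test_install_with_multiple_platforms_auto_detect:
assert expected_paths("deepwork_jobs", ["define", "learn"], True) == [
    Path(".claude/skills/deepwork_jobs/SKILL.md")
]
assert Path(".gemini/skills/deepwork_jobs/index.toml") in expected_paths(
    "deepwork_jobs", ["define", "learn"], False
)
```

Hinging the dispatch on a single adapter class attribute keeps the two modes from leaking into each other: Gemini stays on the legacy layout untouched, while Claude jobs collapse to one agent file that `_cleanup_old_step_skills` can reconcile against.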