Feat: Adopt TypeScript Template Literals for Custom Evaluator Prompts #56

christso · 2025-12-07T23:25:20Z

Summary

Add definePromptTemplate SDK wrapper to @agentv/eval that mirrors defineCodeJudge. Update orchestrator to execute .ts/.js prompt files as subprocesses.

Changes

SDK (@agentv/eval):

Added PromptTemplateInputSchema in packages/eval/src/schemas.ts
Created definePromptTemplate wrapper in packages/eval/src/prompt-template.ts
Exported new symbols: definePromptTemplate, PromptTemplateInputSchema, and the PromptTemplateInput and PromptTemplateHandler types
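For reviewers unfamiliar with the pattern, a prompt template built with template literals might look roughly like the sketch below. The input field names (`question`, `candidateAnswer`, `referenceAnswer`) and the `definePromptTemplateSketch` stand-in are assumptions made for illustration only; the real payload is whatever PromptTemplateInputSchema defines, and the real wrapper is definePromptTemplate in @agentv/eval.

```typescript
// Hypothetical payload shape. These field names are assumptions,
// not the actual PromptTemplateInputSchema.
interface PromptTemplateInputSketch {
  question: string;
  candidateAnswer: string;
  referenceAnswer?: string;
  config?: Record<string, unknown>;
}

type PromptTemplateHandlerSketch = (input: PromptTemplateInputSketch) => string;

// Stand-in for the SDK wrapper: it only checks that the handler returned
// a non-empty prompt. The real wrapper presumably also handles process I/O.
function definePromptTemplateSketch(
  handler: PromptTemplateHandlerSketch,
): PromptTemplateHandlerSketch {
  return (input) => {
    const prompt = handler(input);
    if (prompt.trim().length === 0) {
      throw new Error("Prompt template produced empty output");
    }
    return prompt;
  };
}

// The template itself is an ordinary template literal, so conditionals and
// config lookups are plain TypeScript.
const buildPrompt = definePromptTemplateSketch((input) => {
  const rubricValue = input.config?.rubric;
  const rubric = typeof rubricValue === "string" ? rubricValue : "Judge correctness.";
  const reference = input.referenceAnswer
    ? `\n\nReference answer:\n${input.referenceAnswer}`
    : "";
  return `You are grading an answer.
Rubric: ${rubric}

Question:
${input.question}

Candidate answer:
${input.candidateAnswer}${reference}`;
});
```

Calling `buildPrompt({ question: "2+2?", candidateAnswer: "4" })` falls back to the default rubric and omits the reference section entirely.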

Core (@agentv/core):

Updated LlmJudgeEvaluatorConfig to include config and resolvedPromptPath
Added executePromptTemplate function in orchestrator.ts
Updated resolveCustomPrompt to detect and execute .ts/.js files
Updated evaluator parser to pass through config and skip validation for executable prompts
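As a rough illustration of the extension-based detection described in this list, the decision could be sketched as follows. Only `config` and `resolvedPromptPath` are named in this PR; the remaining fields, the `PromptPlan` type, and `planPromptResolution` are hypothetical and do not reflect the actual resolveCustomPrompt code.

```typescript
// Hypothetical, trimmed-down evaluator config. Only `config` and
// `resolvedPromptPath` come from this PR; the rest is assumed.
interface LlmJudgeConfigSketch {
  name: string;
  prompt?: string;
  resolvedPromptPath?: string;
  config?: Record<string, unknown>;
}

type PromptPlan =
  | { mode: "execute"; path: string; config: Record<string, unknown> }
  | { mode: "static"; path: string }
  | { mode: "inline"; text: string };

// Decide how a prompt should be obtained: run it, read it, or use it verbatim.
function planPromptResolution(evaluator: LlmJudgeConfigSketch): PromptPlan {
  const path = evaluator.resolvedPromptPath;
  if (path !== undefined) {
    // .ts/.js files are treated as executable templates (case-insensitive).
    if (/\.(ts|js)$/i.test(path)) {
      return { mode: "execute", path, config: evaluator.config ?? {} };
    }
    return { mode: "static", path };
  }
  return { mode: "inline", text: evaluator.prompt ?? "" };
}
```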

Testing:

Added unit tests for PromptTemplateInputSchema (14 tests)
Added integration tests for executable prompt templates (3 tests)

Example:

Created examples/features/prompt-template-sdk/ with TypeScript prompt template example

Test plan

bun run build passes
bun run typecheck passes
bun run lint passes
bun test passes (369 tests)

🤖 Generated with Claude Code

…t templates

Add TypeScript/JavaScript support for custom evaluator prompts using the same subprocess pattern as code judges.

Changes:
- Add PromptTemplateInputSchema and definePromptTemplate to @agentv/eval
- Update orchestrator to execute .ts/.js prompt files as subprocesses
- Add config and resolvedPromptPath to LlmJudgeEvaluatorConfig
- Skip validation for executable prompt templates in evaluator parser
- Add unit tests for PromptTemplateInputSchema
- Add integration tests for executable prompt templates
- Add example in examples/features/prompt-template-sdk/

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Reuse CodeJudgeInputSchema for PromptTemplateInputSchema (consistent payloads)
- Add timeout support to executePromptTemplate (prevents hanging scripts)
- Validate non-empty output from prompt templates
- Throw error for missing .ts/.js prompt template files (fail-fast)
- Update tests to reflect required fields

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
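The timeout and non-empty-output checks from this commit can be sketched as below. This is not the actual executePromptTemplate: `runPromptScript` is a hypothetical helper, the payload is assumed to be sent as JSON on stdin, and `spawnSync` is used purely to keep the example short (the orchestrator may well use an async API).

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: run a prompt script, feed it the payload on stdin,
// and return the prompt it prints. Throws on timeout, failure, or empty output.
function runPromptScript(script: string[], payload: unknown, timeoutMs = 10_000): string {
  const [command, ...args] = script;
  if (!command) {
    throw new Error("Prompt script must name a command");
  }

  const result = spawnSync(command, args, {
    input: JSON.stringify(payload), // assumed transport: JSON on stdin
    encoding: "utf8",
    timeout: timeoutMs, // the child is killed if it runs longer than this
  });

  // spawnSync reports spawn failures and timeouts through `error`.
  if (result.error) {
    throw new Error(`Prompt script failed to run: ${result.error.message}`);
  }
  if (result.status !== 0) {
    throw new Error(`Prompt script exited with code ${result.status}: ${result.stderr}`);
  }

  const prompt = result.stdout.trim();
  if (prompt.length === 0) {
    throw new Error("Prompt script produced empty output");
  }
  return prompt;
}
```

Failing on empty output keeps a silently broken template from reaching the LLM judge as a blank prompt.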

…lates

Change executable prompt templates to use explicit script arrays instead of auto-detecting runtime by file extension. This matches the code_judge pattern for consistency.

Before:
  prompt: ../prompts/custom-evaluator.ts  # ambiguous runtime

After:
  prompt:
    script: [bun, run, ../prompts/custom-evaluator.ts]
    config: { ... }

Benefits:
- Consistent with code_judge pattern (one mental model)
- No ambiguity about runtime (user explicitly specifies bun/node/python)
- Future-proof (works with any runtime without code changes)
- Aligns with "Built-ins for Primitives Only" design principle

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
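One way the evaluator parser could distinguish the two forms of `prompt` after this change is sketched below. The `ResolvedPrompt` type and `parsePromptField` are hypothetical names, and treating a plain string as a path to a static prompt file is an assumption; the actual parser may differ.

```typescript
// Hypothetical result type for the two accepted forms of `prompt`.
type ResolvedPrompt =
  | { kind: "file"; path: string }
  | { kind: "script"; command: string[]; config: Record<string, unknown> };

// Accept a plain string (assumed here to be a static prompt file path) or an
// object carrying an explicit script array plus optional config.
function parsePromptField(raw: unknown): ResolvedPrompt {
  if (typeof raw === "string") {
    return { kind: "file", path: raw };
  }
  if (typeof raw === "object" && raw !== null && "script" in raw) {
    const { script, config } = raw as { script: unknown; config?: unknown };
    if (
      !Array.isArray(script) ||
      script.length === 0 ||
      !script.every((part) => typeof part === "string")
    ) {
      throw new Error("prompt.script must be a non-empty array of strings");
    }
    const resolvedConfig =
      typeof config === "object" && config !== null
        ? (config as Record<string, unknown>)
        : {};
    return { kind: "script", command: script as string[], config: resolvedConfig };
  }
  throw new Error("prompt must be a string or an object with a script array");
}
```

Because the runtime is the first element of the array, no extension sniffing is needed: `[bun, run, ...]`, `[node, ...]`, or `[python, ...]` all pass through the same path.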

Reverts to --eval-id as the primary flag for filtering eval cases. This aligns with Jest/Vitest convention (--testNamePattern) where the flag name describes what is being filtered, not the action. Removes --filter alias to keep the CLI simple and match existing docs. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

christso marked this pull request as draft December 7, 2025 23:25

christso force-pushed the main branch from 57cda0c to 5276006 on January 4, 2026 01:56

christso added the enhancement label (New feature or request) Jan 13, 2026

christso force-pushed the feat/ts-prompt-templates branch from 4e912bd to 40e8fc6 on January 28, 2026 07:24

christso marked this pull request as ready for review January 28, 2026 07:25

christso and others added 8 commits January 28, 2026 07:31

chore: archive adopt-ts-template-prompts openspec

303f5db

Move OpenSpec to archive after implementation is complete. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: archive adopt-ts-template-prompts openspec

25d23fa

- Archive change to openspec/changes/archive/2026-01-28-adopt-ts-template-prompts/ - Create new spec openspec/specs/custom-evaluator-prompts/ Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix(examples): use default target for prompt-template-sdk example

6f1a54c

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

feat(cli): add --eval-id as deprecated alias for --filter

48140e2

Maintains backward compatibility for users who were using --eval-id. Shows deprecation warning when used. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

docs(skills): add TypeScript prompt template documentation

daa8ccb

Document the new definePromptTemplate SDK for creating dynamic LLM judge prompts with TypeScript. Includes YAML configuration example and available context fields. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

christso merged commit de3808d into main Jan 28, 2026

christso deleted the feat/ts-prompt-templates branch January 28, 2026 12:51

2 participants
